GCC Newspaper
JUNE 15, 2026
i386 Performance Win

i386: Disable gather optimizations for Diamond Rapids

GCC no longer uses gather instructions for 2- and 4-element vectors when tuning for Diamond Rapids processors, where gather emulation makes better use of the pipeline.

On the Diamond Rapids (DMR) architecture, gather emulation achieves optimal pipeline utilization and parallelism with 2/4-element vectors. This commit therefore disables the use_gather_2parts and use_gather_4parts tunings for Diamond Rapids, so that GCC emulates gathers of those widths instead of emitting gather instructions. The adjustment aims to improve performance on DMR.

In Details

This commit modifies x86-tune.def to disable X86_TUNE_USE_GATHER_2PARTS and X86_TUNE_USE_GATHER_4PARTS for m_DIAMONDRAPIDS. The tuning definitions in that file control code generation strategies for different x86 microarchitectures; these two decide whether gather instructions are used for vectors with 2 and 4 elements. With both disabled, gathers of those widths are emulated, which per the commit achieves better pipeline utilization and parallelism on DMR.
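As a rough picture of how a per-CPU tuning table of this kind can work, here is a minimal sketch in C. It is a simplified model, not GCC's code: the enum names, the `M` macro, `tune_mask`, `tune_enabled`, and the placeholder `CPU_OTHER` are all hypothetical, and the mask values are chosen only to illustrate the idea of excluding one CPU from a tuning.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a per-CPU tuning table.
   None of these names or values are taken from GCC's sources. */
enum cpu { CPU_OTHER, CPU_DIAMONDRAPIDS, CPU_COUNT };

/* One bit per CPU, so a set of CPUs fits in an unsigned mask. */
#define M(c) (1u << (c))

enum tune { TUNE_USE_GATHER_2PARTS, TUNE_USE_GATHER_4PARTS, TUNE_COUNT };

/* For each tuning, the set of CPUs for which it is enabled.
   Writing ~(...) means "every CPU except the ones listed", so adding
   a CPU inside the parentheses disables the tuning for that CPU. */
static const unsigned tune_mask[TUNE_COUNT] = {
    [TUNE_USE_GATHER_2PARTS] = ~M(CPU_DIAMONDRAPIDS),
    [TUNE_USE_GATHER_4PARTS] = ~M(CPU_DIAMONDRAPIDS),
};

/* Returns true when tuning t is enabled for CPU c. */
bool tune_enabled(enum tune t, enum cpu c)
{
    return (tune_mask[t] & M(c)) != 0;
}
```

In this model, "disabling a tuning for a CPU" amounts to adding that CPU's bit to the excluded set; the code generator would then consult `tune_enabled` before choosing an instruction sequence.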

For Context

GCC can generate different code sequences depending on which x86 processor it is targeting. These CPU-specific choices are controlled by tuning definitions. A gather loads several elements from non-contiguous memory addresses into a single vector register; the compiler can either emit a dedicated gather instruction or emulate it with individual loads. This commit disables two gather tunings (use_gather_2parts and use_gather_4parts) for Diamond Rapids processors, because emulation is the more efficient choice for 2- and 4-element vectors on this microarchitecture.
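To make the access pattern concrete, the sketch below shows the kind of loop involved and a hand-written picture of what emulating a 4-element gather means. The function names `gather_loop` and `gather4_emulated` are made up for this illustration, and the second function only approximates the idea; the code GCC actually emits depends on the target, the flags, and the compiler version.

```c
#include <stddef.h>

/* Indexed-load loop: each iteration reads table[idx[i]].
   This is the access pattern a vectorizer may implement as a gather. */
void gather_loop(double *out, const double *table, const int *idx, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = table[idx[i]];
}

/* Illustrative "emulation" of one 4-element gather: four independent
   scalar loads, then four stores, with no gather instruction involved.
   The loads do not depend on each other, so they can be in flight
   at the same time. */
void gather4_emulated(double out[4], const double *table, const int idx[4])
{
    double e0 = table[idx[0]];
    double e1 = table[idx[1]];
    double e2 = table[idx[2]];
    double e3 = table[idx[3]];
    out[0] = e0;
    out[1] = e1;
    out[2] = e2;
    out[3] = e3;
}
```

Both functions produce the same result for four elements. To see what the compiler does with `gather_loop`, one option is to compile it with optimization and inspect the assembly; GCC's developer option `-mtune-ctrl` accepts tuning names such as `use_gather_2parts` and `use_gather_4parts` (a leading `^` turns one off), which should allow comparing the two strategies on a build that supports the target.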

Filed Under: i386, optimization, diamond rapids, gather