GCC Newspaper
JUNE 15, 2026
gcc/x86 Performance Win

Corrected condition for last_4x_vec_label in ix86_expand_movmem

Fixes a condition that prevented the generation of a label for certain memory move sizes, affecting inlined memmove performance.

The condition for generating last_4x_vec_label in ix86_expand_movmem was incorrect, leading to missed optimization opportunities when inlining memmove for specific sizes. The condition min_size < 4 * move_max was changed to min_size <= 4 * move_max to ensure the label is generated when min_size equals 4 * move_max. This can improve performance of inlined memmove in some cases. Several test cases were updated or added to validate the fix.
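The boundary case can be checked with a small sketch. The two functions below are hypothetical stand-ins for the old and new forms of the comparison, not the code in i386-expand.cc; only the names min_size and move_max and the two comparisons themselves come from the commit description.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the old condition: the label is requested
   only when min_size is strictly below 4 * move_max. */
static bool label_needed_old(unsigned min_size, unsigned move_max)
{
    return min_size < 4 * move_max;
}

/* Hypothetical stand-in for the corrected condition: the label is also
   requested when min_size is exactly 4 * move_max. */
static bool label_needed_new(unsigned min_size, unsigned move_max)
{
    return min_size <= 4 * move_max;
}
```

With move_max set to 16, the two predicates agree for every min_size except 64: the old form returns false there and the new form returns true.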

In Detail

In ix86_expand_movmem in i386-expand.cc, last_4x_vec_label is used to handle memory moves of certain sizes when inlining memmove. The condition that controls the emission of this label affects code generation for overlapping unaligned loads and stores. With the old condition, the label was not emitted when min_size was exactly 4 * move_max, for example when MOVE_MAX is 16 bytes and min_size is 64, which led to suboptimal code sequences for memmove. With the fix, the label is emitted in that boundary case as well.
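To illustrate what "overlapping unaligned loads and stores" means, here is a minimal C sketch of the general technique. It is not the sequence GCC emits: the function name, the VEC constant, and the size range it handles are assumptions chosen for illustration, and memcpy into local buffers stands in for vector register loads and stores.

```c
#include <stddef.h>
#include <string.h>

enum { VEC = 16 };  /* assumed chunk width in bytes, standing in for MOVE_MAX */

/* Move n bytes, where VEC <= n <= 2 * VEC, using two chunks: one anchored
   at the start of the region and one anchored at its end. For n < 2 * VEC
   the two chunks overlap in the middle, which is harmless because the
   overlapping bytes are written with the same values twice.
   Both loads complete before either store, so the result is correct even
   when src and dst overlap, as memmove requires. */
static void move_vec_to_2x_vec(unsigned char *dst, const unsigned char *src,
                               size_t n)
{
    unsigned char head[VEC], tail[VEC];

    memcpy(head, src, VEC);            /* load first VEC bytes */
    memcpy(tail, src + n - VEC, VEC);  /* load last VEC bytes  */
    memcpy(dst, head, VEC);            /* store first VEC bytes */
    memcpy(dst + n - VEC, tail, VEC);  /* store last VEC bytes  */
}
```

The appeal of this shape is that a whole range of sizes is covered by a fixed number of loads and stores with no byte-by-byte tail loop. Which size ranges get their own label is exactly what conditions such as the one on last_4x_vec_label decide.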

For Context

When compilers translate source code, they often replace standard library functions like memmove with optimized, inline versions for better performance. This optimization involves generating specific assembly code sequences based on the size of the memory region being moved. Labels act as markers in the generated assembly code, guiding the flow of execution. This commit fixes a bug where a label wasn't being generated in some cases, preventing the compiler from using the optimal code sequence for certain memory move sizes.
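At the source level, the kind of call involved might look like the sketch below. The function shift_record and its 64 to 128 byte range check are hypothetical, chosen so that the smallest possible length equals 4 * 16; whether a particular GCC version and set of flags derives that bound and inlines the call is an assumption, not something the commit summary states.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: shift a record of 64 to 128 bytes up by `gap`
   bytes inside one buffer. Source and destination overlap, so memmove
   is required rather than memcpy. The range check is what could let a
   compiler know a lower bound on len at the memmove call. */
void *shift_record(unsigned char *buf, size_t len, size_t gap)
{
    if (len < 64 || len > 128)
        return NULL;                      /* length outside the supported range */
    return memmove(buf + gap, buf, len);  /* returns the destination pointer */
}
```

The caller is responsible for making sure the buffer holds at least len + gap bytes.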

Filed Under: x86, optimization, memmove, code generation