binutils Newspaper
JUNE 15, 2026
x86 Committed

x86: Check XMM destination when optimizing 128-bit VPBROADCASTQ

Fixes a bug where 256-bit VPBROADCASTQ instructions were incorrectly optimized to 128-bit VPUNPCKLQDQ instructions. The optimization now checks that the destination is an XMM register.

H.J. Lu fixed a bug in the x86 assembler where an optimization meant for 128-bit VPBROADCASTQ instructions could generate incorrect code when the destination operand was a YMM register. The original optimization, intended to replace vpbroadcastq %xmmN, %xmmM with vpunpcklqdq %xmmN, %xmmN, %xmmM (N < 8), failed to check that the destination operand was an XMM register. As a result, the 256-bit vpbroadcastq %xmmN, %ymmM was incorrectly transformed into vpunpcklqdq %xmmN, %xmmN, %xmmM. The fix adds a check for an XMM destination, so the 256-bit form is left unchanged.
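The rewrite rule and the missing check can be illustrated with a small toy model. This is only a sketch, not gas source code: the helper names (parse_reg, optimize_vpbroadcastq) and the check_dest flag are hypothetical and exist purely to contrast the behavior before and after the fix, and only the register-to-register form quoted in the patch description is modeled.

```python
import re

# Hypothetical toy model of the peephole rewrite; this is NOT gas source code.
_REG = re.compile(r"%(xmm|ymm|zmm)(\d+)")


def parse_reg(reg):
    """Split an AT&T register name such as '%ymm3' into ('ymm', 3)."""
    match = _REG.fullmatch(reg)
    if match is None:
        raise ValueError(f"not a vector register: {reg!r}")
    return match.group(1), int(match.group(2))


def optimize_vpbroadcastq(src, dst, check_dest=True):
    """Return the instruction text emitted for 'vpbroadcastq src, dst'.

    check_dest=False mimics the behavior described before the fix, where the
    destination register class was never examined.
    """
    src_class, n = parse_reg(src)
    dst_class, m = parse_reg(dst)

    # Condition from the patch description: XMM source with N < 8.
    eligible = src_class == "xmm" and n < 8
    if check_dest:
        # The fix: the rewrite is only valid for an XMM destination.
        eligible = eligible and dst_class == "xmm"

    if eligible:
        return f"vpunpcklqdq {src}, {src}, %xmm{m}"
    return f"vpbroadcastq {src}, {dst}"


if __name__ == "__main__":
    print(optimize_vpbroadcastq("%xmm1", "%xmm2"))
    print(optimize_vpbroadcastq("%xmm1", "%ymm2", check_dest=False))
    print(optimize_vpbroadcastq("%xmm1", "%ymm2"))
```

With check_dest=False the model reproduces the faulty transformation quoted in the thread; with the default it leaves the YMM form untouched.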

In the Thread (1 participant)
  1. H.J. Lu (proposer)

    “But it didn't check if the destination operand is XMM. As the result, it turned: vpbroadcastq %xmmN, %ymmM into vpunpcklqdq %xmmN, %xmmN, %xmmM”

In Details

This patch fixes an incorrect optimization in the x86 assembler (gas) related to the VPBROADCASTQ instruction. The optimization replaces vpbroadcastq with vpunpcklqdq when the source is one of %xmm0 through %xmm7, a substitution that is equivalent only when the destination is also an XMM register. The bug occurred because the optimization did not check the destination operand size, so an instruction with a YMM destination was rewritten to target the XMM register of the same number.
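To see why the substitution is safe for an XMM destination but not for a YMM one, a lane-level model of the two instructions may help. This is a minimal sketch under stated assumptions: registers are modeled as lists of 64-bit lanes (lowest lane first), the function names are hypothetical, and the model assumes VEX encoding, where a 128-bit operation zeroes the upper 128 bits of the full register.

```python
# Toy lane model: a full 256-bit register is a list of four 64-bit lanes,
# lowest lane first. Function names are illustrative, not from any library.


def vpbroadcastq(src, dst_bits):
    """Full 256-bit register contents after vpbroadcastq with a 128- or 256-bit destination.

    src is the source XMM register as [low_qword, high_qword].
    """
    lanes = dst_bits // 64
    # The low quadword of the source is copied into every destination lane;
    # lanes above the destination width are zeroed (VEX behavior).
    return [src[0]] * lanes + [0] * (4 - lanes)


def vpunpcklqdq_xmm(src1, src2):
    """Full 256-bit register contents after a VEX.128 vpunpcklqdq.

    The low quadwords of the two sources are interleaved into the low
    128 bits; the upper 128 bits are zeroed.
    """
    return [src1[0], src2[0], 0, 0]


if __name__ == "__main__":
    src = [0xAAAA, 0xBBBB]
    same_128 = vpbroadcastq(src, 128) == vpunpcklqdq_xmm(src, src)
    same_256 = vpbroadcastq(src, 256) == vpunpcklqdq_xmm(src, src)
    print("128-bit destination equivalent:", same_128)
    print("256-bit destination equivalent:", same_256)
```

In this model the two instructions agree for a 128-bit destination, while for a 256-bit destination the rewritten form leaves the upper two lanes zero instead of holding the broadcast value.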

For Context

The GNU Assembler (gas) translates assembly code into machine code. During this process, the assembler may perform optimizations to improve the efficiency of the generated code. This patch addresses a bug in one such optimization for the VPBROADCASTQ instruction on x86 processors. The VPBROADCASTQ instruction is part of the AVX2 instruction set, which operates on vectors of data. The bug caused the assembler to rewrite the 256-bit form of this instruction as a 128-bit instruction, so the generated code filled only the low 128 bits of the destination with the broadcast value instead of all 256.

Filed Under: x86, assembler, optimization, VPBROADCASTQ, AVX2