x86: Check XMM destination for VPBROADCASTQ optimization
The assembler now applies its 128-bit VPBROADCASTQ optimization only when the destination is an XMM register.
The optimization that rewrites a 128-bit VPBROADCASTQ as VPUNPCKLQDQ was also applied to instructions with YMM destinations, producing incorrect code. This commit adds a check that the destination operand is an XMM register before the optimization is applied. The assembler now generates correct code for VPBROADCASTQ instructions with both XMM and YMM destinations.
In Details
The commit fixes a regression introduced by commit eb4031cb20aa710834be891f8638e04dbba81edc, which optimized vpbroadcastq %xmmN, %xmmM to vpunpcklqdq %xmmN, %xmmN, %xmmM (N < 8). The regression occurred because that optimization never checked that the destination operand is an XMM register, so it also fired for YMM destinations. The fix adds this check in optimize_encoding within gas/config/tc-i386.c. The testsuite is updated to include a 256-bit vpbroadcastq instruction.
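The condition described above can be sketched as a small standalone predicate. This is only an illustration: the types and the function name can_use_vpunpcklqdq are hypothetical and do not reproduce the actual code in optimize_encoding, which works on gas's own operand structures.

```c
#include <stdbool.h>

/* Hypothetical operand description, for illustration only.
   gas uses its own, richer operand types.  */
enum reg_class { REG_XMM, REG_YMM };

struct reg_operand
{
  enum reg_class cls;   /* register width class */
  unsigned int num;     /* register number, e.g. 3 for %xmm3 */
};

/* Return true when "vpbroadcastq src, dst" may be rewritten as
   "vpunpcklqdq src, src, dst".  */
static bool
can_use_vpunpcklqdq (struct reg_operand src, struct reg_operand dst)
{
  return src.cls == REG_XMM
	 && src.num < 8           /* N < 8, as in the original optimization */
	 && dst.cls == REG_XMM;   /* the destination check this commit adds */
}
```

With this predicate, vpbroadcastq %xmm1, %xmm2 qualifies for the rewrite, while vpbroadcastq %xmm1, %ymm2 and vpbroadcastq %xmm9, %xmm2 are left unchanged.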
For Context
The GNU Assembler (gas) can rewrite certain x86 vector instructions into equivalent forms with a shorter encoding. A recent optimization of this kind for VPBROADCASTQ, which copies the low 64-bit element of its source into every 64-bit element of the destination, was intended only for 128-bit XMM destinations but was also applied to 256-bit YMM destinations. The replacement instruction, a 128-bit VPUNPCKLQDQ, fills only the low 128 bits of the destination, so the upper half of a YMM destination did not receive the broadcast value. This commit makes the assembler apply the optimization only when the destination is an XMM register, ensuring correct code generation for both XMM and YMM destinations.
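To see why the rewrite is safe for XMM destinations but not for YMM destinations, the two instructions can be modeled on 64-bit lanes. The following is a minimal sketch, not assembler code: the vreg type and the model_* helpers are hypothetical names, and the model assumes the usual VEX behavior where a 128-bit operation zeroes the upper bits of the destination register.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Model a 256-bit vector register as four 64-bit lanes; lane 0 is lowest.  */
typedef struct { uint64_t q[4]; } vreg;

/* vpbroadcastq with an XMM (lanes == 2) or YMM (lanes == 4) destination:
   copy the low lane of src into the first `lanes` lanes, zero the rest.  */
static vreg
model_vpbroadcastq (vreg src, int lanes)
{
  vreg dst = { { 0, 0, 0, 0 } };
  for (int i = 0; i < lanes; i++)
    dst.q[i] = src.q[0];
  return dst;
}

/* 128-bit vpunpcklqdq: interleave the low lanes of the two sources into
   the low 128 bits of the destination; the upper 128 bits are zeroed.  */
static vreg
model_vpunpcklqdq128 (vreg src1, vreg src2)
{
  vreg dst = { { 0, 0, 0, 0 } };
  dst.q[0] = src1.q[0];
  dst.q[1] = src2.q[0];
  return dst;
}

/* Compare two modeled registers lane by lane.  */
static bool
same_value (vreg a, vreg b)
{
  return memcmp (a.q, b.q, sizeof a.q) == 0;
}
```

For a source whose low lane is nonzero, model_vpbroadcastq (src, 2) matches model_vpunpcklqdq128 (src, src), while model_vpbroadcastq (src, 4) does not, because lanes 2 and 3 stay zero in the 128-bit result.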