match.pd: Allow FNMA fold through conversions.
GCC can now fold FMA operations into FNMA even when type conversions are present, improving AArch64 SVE code generation.
GCC’s pattern matching for FMA (fused multiply-add) operations can now handle type conversions, enabling the generation of FNMA (fused multiply-negate-add) instructions on AArch64 SVE. Previously, FMA folds in match.pd only matched direct negations, preventing optimizations when the negated operand was wrapped in a type conversion. This change allows the compiler to recognize the subtraction-of-product form even with vector element type casts, leading to more efficient code generation on AArch64 SVE by selecting msb/mls instructions instead of separate neg and mla/mad instructions.
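The two forms involved can be written out in plain C. This is only a sketch under stated assumptions: the function names `mul_neg_add` and `sub_product` are hypothetical, the loops are not taken from the commit or from the PR's testcase, and whether GCC actually emits msb/mls for them depends on the compiler version, flags, and target. The first loop hides the negation behind a same-width signed-to-unsigned cast; the second is the subtraction-of-product form. In wrapping 32-bit unsigned arithmetic both compute the same values.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply-add whose multiplicand is a negation hidden behind a
   same-width signed-to-unsigned cast.  Callers must not pass
   b[i] == INT32_MIN, because negating it overflows.  */
void mul_neg_add(uint32_t *acc, const uint32_t *a, const int32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        acc[i] = a[i] * (uint32_t)(-b[i]) + acc[i];
}

/* The subtraction-of-product form: acc - a * b, with wrapping
   unsigned arithmetic.  */
void sub_product(uint32_t *acc, const uint32_t *a, const int32_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        acc[i] = acc[i] - a[i] * (uint32_t)b[i];
}
```

For example, with `acc = 100`, `a = 3`, `b = 5`, both loops store `85`, since `3 * (uint32_t)(-5) + 100` wraps around to `100 - 15`.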
In Details
This commit modifies match.pd to allow conversions on the negated operand and the second multiplicand in FMA folds, enabling the use of IFN_FNMA. The pattern is restricted to nop_convert on the negated operand to avoid folding through value-changing conversions. The fold is only performed when signed overflow is unobservable for both the outer FMA operation and the inner negation. This change addresses PR target/123924 and improves code generation for AArch64 SVE by enabling the selection of msb/mls instructions.
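Why the negated operand is limited to nop_convert can be seen with a small C counterexample. This is an illustrative sketch, not code from the commit: the function names are hypothetical, and it assumes a 32-bit `int`. When the negation happens in 16 bits and the result is then zero-extended to 32 bits, the conversion changes the relationship between the values, so rewriting the expression as `c - a * b` would give a different answer.

```c
#include <stdint.h>

/* Negate in 16 bits (wrapping), then zero-extend to 32 bits before
   the multiply-add.  The widening is a value-changing conversion.  */
uint32_t fma_with_widened_neg(uint32_t a, uint16_t b, uint32_t c)
{
    uint16_t neg_b = (uint16_t)(0u - b);
    return a * (uint32_t)neg_b + c;
}

/* What a fold to the subtraction-of-product form would compute in
   32 bits.  */
uint32_t fnma_wide(uint32_t a, uint16_t b, uint32_t c)
{
    return c - a * (uint32_t)b;
}
```

With `a = 1`, `b = 1`, `c = 0`, the first function returns `65535` while the second returns `4294967295`, so the two forms are not interchangeable once the conversion changes the width.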
For Context
Fused Multiply-Add (FMA) is an instruction that performs a multiplication and an addition in a single step, which is often faster than issuing the two operations separately and, for floating point, more accurate because the result is rounded only once. This commit enables GCC to recognize more patterns that can be turned into the negated variant, Fused Multiply-Negate-Add (FNMA). This matters on AArch64 processors with the Scalable Vector Extension (SVE), where an FNMA maps directly onto the msb and mls instructions instead of being synthesized from a separate neg followed by mla or mad.