AArch64: Implement FMUL SME instruction.

This commit implements the FMUL SME instruction for AArch64, enhancing support for streaming floating-point multiplication.

By Artemiy Volkov, committed January 16, 2026

The commit implements the FMUL (floating-point multiply) instruction from the Scalable Matrix Extension (SME) on AArch64. It covers two forms: multi-vector floating-point multiply by vector, in which every vector of a group is multiplied by one shared vector, and multi-vector floating-point multiply, in which two groups of vectors are multiplied pairwise. New intrinsics corresponding to these forms have been added, so streaming-mode code that uses them can be compiled directly to the new instruction.
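To make the difference between the two forms concrete, here is a minimal scalar model in portable C. It is only a sketch of the element-wise semantics, not the intrinsics added by the commit: the names `vec_t`, `VL_ELEMS`, `fmul_multi_by_single` and `fmul_multi_by_multi` are hypothetical, the fixed lane count stands in for the hardware-defined vector length, and the restriction to groups of two or four vectors is an assumption based on how SME2 multi-vector instructions are usually encoded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scalable vector register; real hardware
   picks the number of lanes at run time. */
#define VL_ELEMS 4

typedef struct {
    float lane[VL_ELEMS];
} vec_t;

/* Model of "multi-vector multiply by vector": each of the nvec vectors
   in zn is multiplied lane by lane with the same single vector zm. */
void fmul_multi_by_single(vec_t *zd, const vec_t *zn,
                          const vec_t *zm, size_t nvec)
{
    assert(nvec == 2 || nvec == 4); /* assumed group sizes */
    for (size_t v = 0; v < nvec; v++)
        for (size_t i = 0; i < VL_ELEMS; i++)
            zd[v].lane[i] = zn[v].lane[i] * zm->lane[i];
}

/* Model of "multi-vector multiply": vector v of zn is multiplied lane by
   lane with vector v of zm, so both operands are groups. */
void fmul_multi_by_multi(vec_t *zd, const vec_t *zn,
                         const vec_t *zm, size_t nvec)
{
    assert(nvec == 2 || nvec == 4); /* assumed group sizes */
    for (size_t v = 0; v < nvec; v++)
        for (size_t i = 0; i < VL_ELEMS; i++)
            zd[v].lane[i] = zn[v].lane[i] * zm[v].lane[i];
}
```

In the first function the second operand is a single vector reused across the group, which is what distinguishes the "by vector" form; in the second, both operands are groups of the same size.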

In Detail

This patch defines new SVE function variants in aarch64-sve-builtins-sve2.def for svmul. It introduces instruction patterns @aarch64_sve_<optab><mode> and @aarch64_sve_<optab><mode>_single in aarch64-sve2.md. It also defines TARGET_STREAMING_SME2p2 in aarch64.h. New tests are added in the gcc.target/aarch64/sme2/acle-asm/ directory, exercising the new intrinsics. These changes implement the FMUL SME instruction, contributing to the full implementation of the SME2.2 extension.

For Context

This commit adds support for a new instruction in the Scalable Matrix Extension (SME) for AArch64 processors. SME is designed to accelerate matrix operations, which are fundamental to many tasks in scientific computing, machine learning, and multimedia processing. The new FMUL forms perform element-wise floating-point multiplication on groups of vector registers in streaming mode, rather than one vector at a time. With the instruction implemented, the compiler can emit it for code that uses the corresponding SME intrinsics.

Filed Under: aarch64, SME, FMUL, intrinsics, floating-point
