GCC Newspaper
JUNE 15, 2026
aarch64 Performance Win

AArch64: Add support for range prefetch intrinsic

GCC now supports new AArch64 intrinsics for range prefetching, which let code hint that a whole range of memory should be loaded into the cache.

This commit adds support for new AArch64 intrinsics, __pld_range and __pldx_range, which enable range prefetching. These intrinsics allow developers to tell the processor to prefetch a specified range of memory into the cache, potentially improving performance by reducing memory access latency. The change adds new builtins, implements their expansion during code generation, and defines new unspec operations in the aarch64.md file. It also adds a new macro, __ARM_PREFETCH_RANGE, for conditional compilation.
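As a rough illustration of how the __ARM_PREFETCH_RANGE macro might be used for conditional compilation, the sketch below detects the feature at compile time and provides a portable fallback. The names HAVE_RANGE_PREFETCH, CACHE_LINE_BYTES, range_prefetch_available and prefetch_range_fallback are hypothetical and are not part of the commit. The 64-byte line size is an assumption. The fallback uses GCC's existing __builtin_prefetch once per assumed cache line rather than the new intrinsics, whose argument lists are not given in this article and are therefore not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Feature detection: the article says the compiler defines
   __ARM_PREFETCH_RANGE when the range prefetch intrinsics exist. */
#if defined(__ARM_PREFETCH_RANGE)
#include <arm_acle.h>          /* ACLE intrinsics are declared here */
#define HAVE_RANGE_PREFETCH 1  /* hypothetical helper macro */
#else
#define HAVE_RANGE_PREFETCH 0
#endif

/* Assumed cache-line size in bytes (hypothetical; varies by CPU). */
#define CACHE_LINE_BYTES 64

/* Hypothetical query so callers can pick a code path at run time. */
static int range_prefetch_available(void)
{
    return HAVE_RANGE_PREFETCH;
}

/* Hypothetical fallback: issue one read-prefetch hint per assumed
   cache line in [addr, addr + len). Returns the number of hints.
   Where HAVE_RANGE_PREFETCH is 1, a call to __pld_range or
   __pldx_range (see arm_acle.h for the exact arguments) could
   replace this loop. */
static size_t prefetch_range_fallback(const void *addr, size_t len)
{
    assert(addr != NULL || len == 0);

    const char *p = (const char *)addr;
    size_t hints = 0;

    for (size_t off = 0; off < len; off += CACHE_LINE_BYTES) {
        /* rw = 0 (read), locality = 3 (keep in all cache levels) */
        __builtin_prefetch(p + off, 0, 3);
        hints++;
    }
    return hints;
}
```

A caller would typically invoke prefetch_range_fallback(buffer, size) shortly before a pass over the buffer. Prefetch hints never change program results, so the helper's only observable effect is the returned hint count.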

In Details

This change introduces the __pld_range and __pldx_range intrinsics for AArch64, augmenting the existing prefetch capabilities with a hint that a specified memory range should be preloaded into the cache. It extends the aarch64_builtins enum and aarch64_init_prefetch_builtins to register the builtins, defines UNSPEC_PLDX_RANGE and UNSPEC_PLD_RANGE in aarch64.md for instruction generation, and implements aarch64_expand_prefetch_range_builtin in aarch64-builtins.cc for code generation. The aarch64_update_cpp_builtins in aarch64-c.cc also adds __ARM_PREFETCH_RANGE, en…

For Context

This update to GCC (the GNU Compiler Collection) adds new capabilities for AArch64 processors, which are commonly found in mobile devices and servers. Processors have a small, very fast memory called a 'cache' to store data they expect to use soon. 'Prefetching' is a technique where the processor tries to guess what data you'll need next and loads it into the cache ahead of time, which can significantly speed up your program by reducing delays waiting for data from main memory. This change introduces new special functions, called 'intrinsics,' that allow programmers to specifically tell the AArch64 processor to prefetch a *range* of memory. This extra control helps developers fine-tune their code for better performance, especially in applications that frequently access large blocks of data.

Filed Under: aarch64, performance, intrinsics, optimization, memory