GCC Newspaper
JUNE 15, 2026
RISC-V Performance Win

RISC-V: Use vnsrl instruction for even-odd shuffles.

GCC now uses the vnsrl instruction for even-odd shuffles on RISC-V, which may improve performance over vcompress.

This change replaces the vcompress instruction with vnsrl when implementing even-odd shuffles on RISC-V. The vcompress instruction is typically slower and requires a mask load, so using vnsrl should improve performance on many implementations. The new behavior matches what LLVM currently does.

In Details

On RISC-V, vector compress (vcompress) typically requires a mask load and is often slower than vector narrowing shift right logical (vnsrl). This commit modifies the shuffle_even_odd_patterns function in riscv-v.cc to emit vnsrl when possible. Avoiding vcompress could improve performance on hardware where that instruction is slow.
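To see why a narrowing shift can stand in for a shuffle, it helps to model the idea in plain Python. The sketch below is not GCC's code. It assumes 8-bit elements and little-endian element order, and the helper names (widen_pairs, vnsrl_model, even_elements, odd_elements) are hypothetical. Each adjacent pair of elements is viewed as one double-width element. A narrowing shift by 0 then keeps the even-indexed element of each pair, and a shift by the element width keeps the odd-indexed one.

```python
SEW = 8  # assumed element width in bits, chosen only for illustration


def widen_pairs(elems, sew=SEW):
    """Reinterpret adjacent SEW-bit elements as 2*SEW-bit elements.

    Assumes little-endian element order: the even-indexed element lands
    in the low half of the wide element, the odd-indexed one in the high half.
    """
    assert len(elems) % 2 == 0, "need an even number of elements"
    return [elems[i] | (elems[i + 1] << sew) for i in range(0, len(elems), 2)]


def vnsrl_model(wide, shift, sew=SEW):
    """Model a narrowing logical right shift.

    Each 2*SEW-bit element is shifted right by `shift` bits and then
    truncated to its low SEW bits.
    """
    low_bits = (1 << sew) - 1
    return [(w >> shift) & low_bits for w in wide]


def even_elements(elems, sew=SEW):
    # Shift by 0: the low half of each pair survives the truncation.
    return vnsrl_model(widen_pairs(elems, sew), 0, sew)


def odd_elements(elems, sew=SEW):
    # Shift by SEW: the high half of each pair moves into the low half.
    return vnsrl_model(widen_pairs(elems, sew), sew, sew)


v = [10, 11, 12, 13, 14, 15, 16, 17]
print(even_elements(v))  # [10, 12, 14, 16]
print(odd_elements(v))   # [11, 13, 15, 17]
```

No mask appears anywhere in this model. The only operand besides the data is the shift amount, which is what removes the mask load from the sequence.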

For Context

Vector instructions perform the same operation on multiple data elements simultaneously. Instead of processing each element individually, a single instruction can manipulate an entire vector. On RISC-V, the vector compress instruction (vcompress) selects elements from a vector based on a mask and packs them together. An even-odd shuffle extracts either the even-indexed elements (indices 0, 2, 4, 6, ...) or the odd-indexed elements (indices 1, 3, 5, 7, ...) of its input. This commit optimizes even-odd shuffles by using vector narrowing shift right logical (vnsrl), which is faster than vcompress in some cases.
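For comparison, the mask-based approach can be modeled in a few lines of Python. This is a simplified sketch, not the instruction's full semantics: it returns only the packed elements and ignores what happens to the remaining tail elements of the destination register. The names vcompress_model and even_mask are hypothetical.

```python
def vcompress_model(src, mask):
    """Keep the elements of src whose mask bit is set, packed in order."""
    return [x for x, m in zip(src, mask) if m]


def even_mask(n):
    """Build the alternating mask that selects even-indexed elements.

    On hardware this mask has to be available in a mask register first,
    which is the mask load the article refers to.
    """
    return [i % 2 == 0 for i in range(n)]


v = [10, 11, 12, 13, 14, 15, 16, 17]
print(vcompress_model(v, even_mask(len(v))))  # [10, 12, 14, 16]
```

The result matches an even-element shuffle, but only once the alternating mask has been produced, which is extra work on top of the compress itself.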

Filed Under: risc-v, optimization, vectorization, performance