RFC: GCC support for masked compress stores and VPCOMPRESS-style codegen
Project / Subsystem: gcc / rfc
Date: 2026-05-14
Proposer: Raghesh Aloor <Raghesh.Aloor@amd.com>
Source type: public_inbox
Consensus: Proposed
Sentiment: —/10
Technical tradeoffs
- Increased compiler complexity to recognize and optimize the specific loop pattern.
- Potential for increased code size if VPCOMPRESS is not profitable in all cases.
- Requires careful tuning of heuristics to determine when VPCOMPRESS is beneficial.
- May introduce new dependencies on the AVX-512 instruction set.
All attributes
- project: gcc
- subsystem: rfc
- patch_id: —
- discussion_id: 20260514062146.3464666-1-Raghesh.Aloor@amd.com
- source_type: public_inbox
- title: RFC: GCC support for masked compress stores and VPCOMPRESS-style codegen
- headline: RFC: GCC support for masked compress stores and VPCOMPRESS codegen
- tldr: Proposes extending GCC's loop vectorizer to generate AVX-512 VPCOMPRESS instructions for predicated stores into a buffer.
- proposer: Raghesh Aloor <Raghesh.Aloor@amd.com>
- consensus: Proposed
- outcome: proposed
- sentiment_score: —
- technical_tradeoffs:
  - Increased compiler complexity to recognize and optimize the specific loop pattern.
  - Potential for increased code size if VPCOMPRESS is not profitable in all cases.
  - Requires careful tuning of heuristics to determine when VPCOMPRESS is beneficial.
  - May introduce new dependencies on the AVX-512 instruction set.
- series_id: —
- series_role: standalone
- series_parts: []
- tags:
  - gcc
  - loop vectorization
  - AVX-512
  - VPCOMPRESS
  - code generation
- bugzilla_url: —
- date: 2026-05-14T00:00:00.000Z
RFC: GCC support for masked compress stores and VPCOMPRESS-style codegen
This RFC proposes extending GCC’s loop vectorizer to recognize loops in which a store into a buffer is guarded by a predicate and the buffer offset is incremented under that same predicate. Recognizing this pattern would let the backend emit AVX-512 VPCOMPRESS instructions when profitable. The change addresses PR tree-optimization/91198, where GCC failed to generate AVX-512 compress/expand instructions in relevant cases. An initial prototype patch for phase 1 is expected on the gcc-patches list soon, with the aim of improving performance on code that can benefit from masked compress stores.
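
To make the targeted loop shape concrete, the following is a minimal sketch in C, not code taken from the RFC. The function names `compress_positive` and `compress_store_model`, the element type `int`, and the predicate `src[i] > 0` are all assumptions chosen for illustration. The first function is the kind of scalar loop described above: a store guarded by a predicate, with the output index advanced under the same predicate. The second is a portable scalar model of what a compress store does to one vector's worth of lanes under a mask. It only illustrates the semantics and is not what GCC would emit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example of the loop pattern: the store to dst[] and the
   increment of the output index j are guarded by the same predicate. */
size_t compress_positive(const int *src, int *dst, size_t n)
{
    size_t j = 0;
    for (size_t i = 0; i < n; i++) {
        if (src[i] > 0) {      /* predicate (assumed for illustration) */
            dst[j] = src[i];   /* predicate-guarded store */
            j++;               /* offset advances only when the store happens */
        }
    }
    return j;                  /* number of elements written */
}

/* Portable scalar model of a compress store on one chunk of lanes:
   lanes whose mask bit is set are written contiguously to dst, in lane
   order. Returns the number of lanes written, i.e. the population count
   of the low nlanes bits of mask. */
size_t compress_store_model(int *dst, unsigned mask,
                            const int *lanes, size_t nlanes)
{
    assert(nlanes <= 32);      /* mask holds at most 32 lane bits here */
    size_t k = 0;
    for (size_t lane = 0; lane < nlanes; lane++) {
        if (mask & (1u << lane))
            dst[k++] = lanes[lane];
    }
    return k;
}
```

In a vectorized version of the first loop, each iteration would compute a mask from the predicate over a full vector of `src`, perform one compress store to `dst + j`, and then advance `j` by the number of set mask bits. Whether that form is faster than the scalar loop depends on the target, which is why the proposal limits the transformation to cases where it is profitable.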