x86: Fix stack alignment for AVX arguments on Windows.
Fixes stack alignment issues for indirect AVX arguments and return values on Windows, ensuring correct memory access.
This patch resolves stack alignment issues on x86_64 Windows when using AVX instructions. The Windows ABI limits stack alignment to 128 bits, while 256-bit AVX values are passed and returned indirectly, through memory. The stack slots holding those values could therefore be under-aligned, leading to incorrect memory accesses. The fix dynamically allocates stack space for over-aligned arguments and return slots, and over-allocates local stack slots so that a properly aligned address can be found inside them.
In Detail
On x86_64-w64-mingw32, TARGET_SEH limits MAX_SUPPORTED_STACK_ALIGNMENT to 128 bits, but 256-bit AVX values are often passed and returned indirectly. Some caller/callee stack-slot paths still used generic allocators that cap the requested alignment at MAX_SUPPORTED_STACK_ALIGNMENT, producing slots that are under-aligned for later vmovapd/vmovaps accesses. The patch introduces a target hook, overaligned_stack_slot_required, to control when this over-aligned stack-slot handling is required.
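
The over-allocation idea can be illustrated with a minimal standalone sketch in C. This is not the code from the patch: the helper names overaligned_reserve_size and align_up are hypothetical, and sizes are in bytes (128 bits = 16 bytes, 256 bits = 32 bytes). It only shows the arithmetic of reserving extra space in a slot with a weaker alignment guarantee and rounding the address up to the required boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): number of bytes to reserve
   so that a `size`-byte object can be placed on an `align`-byte boundary
   inside a slot that is only guaranteed `base_align`-byte alignment.
   Both alignments must be powers of two. */
static size_t overaligned_reserve_size(size_t size, size_t align,
                                       size_t base_align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    assert(base_align != 0 && (base_align & (base_align - 1)) == 0);

    /* The slot's own guarantee is already strong enough. */
    if (align <= base_align)
        return size;

    /* The start is a multiple of base_align, so the padding needed to
       reach the next align boundary is at most align - base_align. */
    return size + (align - base_align);
}

/* Hypothetical helper: round `addr` up to the next multiple of `align`
   (a power of two). An already aligned address is returned unchanged. */
static uintptr_t align_up(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + (align - 1)) & ~(uintptr_t)(align - 1);
}
```

For a 32-byte value in a slot that is only 16-byte aligned, this reserves 48 bytes. Whichever 16-byte boundary the slot starts on, the rounded-up address leaves a full 32 bytes inside the reservation.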
For Context
The Application Binary Interface (ABI) defines how software components interact at the machine code level, including data types, calling conventions, and stack usage. AVX (Advanced Vector Extensions) is an x86 SIMD instruction set extension that operates on multiple data elements at once using 256-bit registers (AVX-512 widens these to 512 bits). Aligned vector moves such as vmovaps and vmovapd require their memory operand to be aligned to the operand size and fault otherwise, so stack alignment is a matter of correctness, not only performance. This patch ensures that the stack slots used for AVX arguments and return values on Windows are properly aligned.