Thanks for the help investigating, both of you!

> It seems to be a regression between gcc 10 and gcc 11 (discovered by
> changing the compiler on godbolt.org). With gcc 11 onwards, the
> compiler seems to be using the stack to combine two 4-byte elements at a
> time into a single 8-byte element. It's easy to see the effect by
> changing the loop size to 2.

I gave this a bisect, and it looks like the commit that caused it is
this one:

commit 33c0f246f799b7403171e97f31276a8feddd05c9
Author: Richard Biener <rguenther@xxxxxxx>
Date:   Fri Oct 30 11:26:18 2020 +0100

    tree-optimization/97626 - handle SCCs properly in SLP stmt analysis

The details of how this ties back to the behavior we're seeing go far
above my head, unfortunately. I'll go ahead and file a bug on the
tracker.

Thanks again,
Gaelan

PS Apologies for the horrendous wrapping in my previous email! I'm using
webmail because my normal machine is in repair, and list etiquette
totally slipped my mind.