On Mon, Mar 17, 2025 at 11:38 AM Willy Tarreau <w@xxxxxx> wrote:
> OK thanks, but that remains quite strange to me. How can we end up
> here with such an unaligned stack ? At the very minimum I'd expect
> all offsets to be multiple of 8.

It's a peculiar feature of the version 9 SPARC architecture and
runtime. This also ties into your window save area question.

Let's start with these:

 * There are 16 save-able registers in a window.
 * Before V9, registers were 32 bits wide.
 * In V9 and later, registers are 64 bits wide.
 * Each stack frame must provide an area for register data.

Now 32 bits = 4 bytes, times 16 regs = 64 bytes. So for V8 and
lower, the register save area is [%sp+0] through [%sp+63]
inclusive.

Now V9 comes along and we need 128 bytes (16 regs times 8 bytes).
But we're going to run old V8 code in compatibility mode! How will
we tell that some function f() is running in V8 mode instead of V9
mode? [footnote]

Someone decided that the way to tell would be to use a deliberately
weird alignment of the stack pointer. If the stack pointer is odd
(it is the true, aligned stack address minus 2047, so it comes out
as 1 mod 8), then we're in 64-bit V9 mode and [%sp+2047+0] through
[%sp+2047+127] inclusive are the register save area. If not, it
must be 0 mod 8 and we're in V8 mode and things are as before.

Why 2047? Well, by observation, it's more common to need negative
offsets from the stack pointer (for a large stack-area array, for
instance) than it is to need positive ones (the register window
save area and the overflow function argument area beyond that). But
the instruction set is more or less symmetric, with a 13-bit
immediate constant offset of -4096 to +4095. Solution: add some
offset to the stack pointer so that function-stack memory is
[%sp-4096] through [%sp+2046], a 6 kilobyte range instead of a 4k
one.

The stack offset therefore helps solve both problems: it indicates
whether to use V8 or V9 register dump conventions and, at the same
time, increases the amount of easily-accessed stack memory.
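To make the arithmetic concrete, here is a minimal C sketch of the
decision described above. It is only an illustration: the names
(classify_sp, struct save_area, STACK_BIAS, NREGS) are made up for
this example, and the "odd %sp means V9" test assumes the true stack
address is at least 8-byte aligned before the 2047 bias is
subtracted.

```c
#include <stdint.h>

#define STACK_BIAS 2047u  /* V9 stack bias discussed above */
#define NREGS      16u    /* save-able registers per window */

/* Hypothetical result type: where the register save area lives. */
struct save_area {
    int      is_v9;  /* 1 if %sp looks biased (64-bit frame) */
    uint64_t start;  /* address of the first byte of the save area */
    uint64_t size;   /* size of the save area in bytes */
};

/*
 * Classify a raw %sp value. Assumption: an aligned address minus
 * 2047 is always odd, so bit 0 alone separates the two cases.
 */
struct save_area classify_sp(uint64_t sp)
{
    struct save_area a;

    a.is_v9 = (sp & 1) != 0;
    if (a.is_v9) {
        a.start = sp + STACK_BIAS;  /* undo the bias */
        a.size  = NREGS * 8;        /* 16 regs * 8 bytes = 128 */
    } else {
        a.start = sp;               /* V8: no bias */
        a.size  = NREGS * 4;        /* 16 regs * 4 bytes = 64 */
    }
    return a;
}
```

For example, classify_sp(0x801) reports a V9 frame whose save area
starts at 0x1000 and is 128 bytes long, while classify_sp(0x1000)
reports a V8 frame with a 64-byte save area at the same address.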

[footnote] This provides the ability to dynamically link V8 and V9
code together. As far as I know this was never used, so a
per-process mode bit would have sufficed just as well. Still, the
offset went in.

Chris