Re: Stack frame question on x86 code generation

Arturas Moskvinas <arturas.moskvinas@xxxxxxxxx> · Sun, 24 Apr 2005 19:13:01 +0300

> However, my first question still remains since I
> cannot
> reasoning about the 16 bytes padding for the array
> buffer1.
I think the reason is that you chose to use int *res; remove this
variable and you'll see that gcc then tries a different alignment
strategy.

It might work like this:
1. align char[5] to 8.
2. align char[10] to 16.
3. align int *res to 4.
That gives mixed alignments, so use the biggest one (16) for every member:
1. align char[5] to 16.
2. align char[10] to 16.
3. align int *res to 16.
Now we have 48 bytes. Let's align that to 64 (2^6):
1. add 16 bytes of padding.
2. align char[5] to 16.
3. align char[10] to 16.
4. align int *res to 16.

I think we lose 4 bytes for the return address, and another 4 bytes
for pushing EBP onto the stack.
That leaves 56 bytes.
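
To make the arithmetic concrete, here is a rough sketch of my guess in C.
It only models the steps I listed; it is not gcc's real layout algorithm,
and the names round_up and guessed_frame_bytes are made up for this mail.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align (align must be a power of two). */
size_t round_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical model of the guess above: three locals in 16-byte slots,
 * the block rounded up to 64, then 4 bytes each taken by the return
 * address and the saved EBP. */
size_t guessed_frame_bytes(void)
{
    size_t slots = round_up(5, 16)                /* char[5]  -> 16 */
                 + round_up(10, 16)               /* char[10] -> 16 */
                 + round_up(sizeof(int *), 16);   /* int *res -> 16 */
    size_t block = round_up(slots, 64);           /* 48 -> 64 */
    return block - 4 - 4;                         /* return address, EBP */
}
```

Calling guessed_frame_bytes() gives 56, which matches the number above.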

Arturas Moskvinas
P.S.: From the Intel optimization guide:
ftp://download.intel.com/design/Pentium4/manuals/24896611.pdf
"Employ data structure layout optimization to ensure efficient use of
64-byte cache line size."
AMD does not say much about alignment; they only say data should be
aligned to a multiple of a doubleword or quadword.
x86 processors allow misaligned memory access, but it costs at least two
memory read cycles to read the data!
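
A small sketch of how one could check this for a given address. The helper
names is_aligned and crosses_line are mine, and the 64-byte line size is
taken from the Intel quote above; the exact cost of a split access depends
on the processor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* True if addr is a multiple of align (align must be a power of two). */
int is_aligned(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* True if an access of size bytes starting at addr touches two
 * different lines of line bytes each, so both must be read. */
int crosses_line(uintptr_t addr, size_t size, size_t line)
{
    assert(size > 0 && line > 0);
    return (addr / line) != ((addr + size - 1) / line);
}
```

For example, a 4-byte read at offset 62 spans two 64-byte lines, while the
same read at offset 60 stays inside one.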