> However, my first question still remains since I cannot reason about
> the 16 bytes padding for the array buffer1.

I think the reason is that you chose to use int *res; remove this variable, and you'll see that gcc now tries a different alignment strategy.

It might go like this. First attempt:

1. align char[5] to 8
2. align char[10] to 16
3. align int *res to 4

That leaves some misalignment, so everything is aligned to the biggest member (size 16):

1. align char[5] to 16
2. align char[10] to 16
3. align int *res to 16

Now we have 48 bytes. Let's align that to 64 (2^6):

1. add 16 bytes of padding
2. align char[5] to 16
3. align char[10] to 16
4. align int *res to 16

I think we then lose 4 bytes for the return address, and an additional 4 bytes for pushing EBP onto the stack. That leaves 64 - 4 - 4 = 56 bytes.

Arturas Moskvinas

P.S.: From the Intel optimization guide:
ftp://download.intel.com/design/Pentium4/manuals/24896611.pdf

"Employ data structure layout optimization to ensure efficient use of 64-byte cache line size."

AMD does not say much about alignment; they only say it should be a multiple of a doubleword or quadword. x86 processors allow misaligned memory access, but it costs at least two memory read cycles to read the data!
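
To make the arithmetic in my guess above easier to check, here is a minimal sketch in C. It only reproduces the rounding steps I described (16-byte slots, rounding up to 64, minus return address and saved EBP); it is not gcc's actual allocation algorithm. The names round_up and guessed_frame_size are hypothetical helpers made up for this illustration, and the 4-byte pointer size assumes 32-bit x86.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of align. */
static size_t round_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/* Hypothetical helper: replays the guessed steps, nothing more. */
static size_t guessed_frame_size(void)
{
    const size_t slot = 16;        /* assumed per-variable alignment */
    const size_t cache_line = 64;  /* from the Intel guide quote */
    size_t total = 0;

    total += round_up(5, slot);    /* char[5]  -> 16 */
    total += round_up(10, slot);   /* char[10] -> 16 */
    total += round_up(4, slot);    /* int *res -> 16 (4-byte pointer assumed) */

    total = round_up(total, cache_line);  /* 48 -> 64 */

    /* Assumed: 4 bytes for the return address, 4 for the saved EBP. */
    return total - 4 - 4;
}
```

Under these assumptions guessed_frame_size() returns 56, which matches the number observed. To see what gcc really does, the assembly from gcc -S is the only reliable check.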