On Thu, Nov 08, 2018 at 03:35:02PM -0800, Dave Hansen wrote:
> On 11/8/18 2:00 PM, Matthew Wilcox wrote:
> > struct a {
> > 	char c;
> > 	struct b b;
> > };
> >
> > we want struct b to start at offset 8, but with __packed, it will start
> > at offset 1.
>
> You're talking about how we want the struct laid out in memory if we
> have control over the layout.  I'm talking about what happens if
> something *else* tells us the layout, like a hardware specification
> which is what is in play with the XSAVE instruction dictated layout
> that's in question here.
>
> What I'm concerned about is a structure like this:
>
> 	struct foo {
> 		u32 i1;
> 		u64 i2;
> 	};
>
> If we leave that to natural alignment, we end up with a 16-byte
> structure laid out like this:
>
> 	0-3	i1
> 	3-8	alignment gap
> 	8-15	i2

I know you actually meant:

	0-3	i1
	4-7	pad
	8-15	i2

> Which isn't what we want.  We want a 12-byte structure, laid out like this:
>
> 	0-3	i1
> 	4-11	i2
>
> Which we get with:
>
> 	struct foo {
> 		u32 i1;
> 		u64 i2;
> 	} __packed;

But we _also_ get pessimised accesses to i1 and i2.  Because gcc can't
rely on struct foo being aligned to a 4 or even 8 byte boundary (it
might be embedded in "struct a" from above).

> Now, looking at Yu-cheng's specific example, it doesn't matter.  We've
> got 64-bit types and natural 64-bit alignment.  Without __packed, we
> need to look out for natural alignment screwing us up.  With __packed,
> it just does what it *looks* like it does.

The question is whether Yu-cheng's struct is ever embedded in another
struct.  And if so, what does the hardware do?
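
The layouts discussed above can be checked with a small userspace sketch.
It assumes x86-64 (or another LP64 ABI where a 64-bit integer is 8-byte
aligned) and GCC or Clang.  The names foo_natural, foo_packed,
outer_natural, outer_packed and check_layouts() are made up for
illustration, uint32_t/uint64_t stand in for the kernel's u32/u64, and
__attribute__((packed)) is spelled out in place of the kernel's __packed
macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Natural alignment: 4 bytes of padding follow i1 so that i2 is 8-byte aligned. */
struct foo_natural {
	uint32_t i1;
	uint64_t i2;
};

/* Packed: no padding, and the alignment of the struct itself drops to 1. */
struct foo_packed {
	uint32_t i1;
	uint64_t i2;
} __attribute__((packed));

/* Embedding after a char, in the spirit of "struct a" from the thread. */
struct outer_natural {
	char c;
	struct foo_natural foo;
};

/* The outer struct is NOT packed; only the embedded type is. */
struct outer_packed {
	char c;
	struct foo_packed foo;
};

/* Aborts if the ABI in use does not match the assumptions stated above. */
void check_layouts(void)
{
	/* 0-3 i1, 4-7 pad, 8-15 i2 */
	assert(sizeof(struct foo_natural) == 16);
	assert(offsetof(struct foo_natural, i2) == 8);
	assert(_Alignof(struct foo_natural) == 8);

	/* 0-3 i1, 4-11 i2 */
	assert(sizeof(struct foo_packed) == 12);
	assert(offsetof(struct foo_packed, i2) == 4);
	assert(_Alignof(struct foo_packed) == 1);

	/* The natural struct lands at offset 8, the packed one at offset 1. */
	assert(offsetof(struct outer_natural, foo) == 8);
	assert(offsetof(struct outer_packed, foo) == 1);
}
```

Calling check_layouts() from main() passes silently under those
assumptions.  The last assertion is the embedding case: because the
packed type has alignment 1, the compiler may place it at any byte
offset, so it cannot assume i1 or i2 is naturally aligned when
generating accesses to them.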