Re: [PATCH v5 04/27] x86/fpu/xstate: Add XSAVES system states for shadow stack

Matthew Wilcox <willy@xxxxxxxxxxxxx> · Thu, 8 Nov 2018 16:32:25 -0800

On Thu, Nov 08, 2018 at 03:35:02PM -0800, Dave Hansen wrote:
> On 11/8/18 2:00 PM, Matthew Wilcox wrote:
> > struct a {
> > 	char c;
> > 	struct b b;
> > };
> > 
> > we want struct b to start at offset 8, but with __packed, it will start
> > at offset 1.
> 
> You're talking about how we want the struct laid out in memory if we
> have control over the layout.  I'm talking about what happens if
> something *else* tells us the layout, like a hardware specification
> which is what is in play with the XSAVE instruction dictated layout
> that's in question here.
> 
> What I'm concerned about is a structure like this:
> 
> struct foo {
>         u32 i1;
>         u64 i2;
> };
> 
> If we leave that to natural alignment, we end up with a 16-byte
> structure laid out like this:
> 
> 	0-3	i1
> 	3-8	alignment gap
> 	8-15	i2

I know you actually meant:

	0-3	i1
	4-7	pad
	8-15	i2
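
For anyone who wants to check that layout rather than take my word for
it, here is a minimal userspace sketch.  The name foo_natural is made up
for illustration, uint32_t/uint64_t stand in for u32/u64, and the numbers
assume the x86-64 System V ABI, where a 64-bit integer is 8-byte aligned:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the quoted struct foo, natural alignment. */
struct foo_natural {
	uint32_t i1;
	uint64_t i2;
};

/* Assumption: x86-64 System V ABI (uint64_t is 8-byte aligned). */
static_assert(offsetof(struct foo_natural, i1) == 0, "i1 occupies 0-3");
static_assert(offsetof(struct foo_natural, i2) == 8, "i2 occupies 8-15 after 4 bytes of pad");
static_assert(sizeof(struct foo_natural) == 16, "16 bytes in total");
```

If any of those assumptions are wrong for the target ABI, the build fails
instead of silently producing a different layout.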

> Which isn't what we want.  We want a 12-byte structure, laid out like this:
> 
> 	0-3	i1
> 	4-11	i2
> 
> Which we get with:
> 
> struct foo {
>         u32 i1;
>         u64 i2;
> } __packed;

But we _also_ get pessimised accesses to i1 and i2, because gcc can't
rely on struct foo being aligned to a 4 or even 8 byte boundary (it
might be embedded in "struct a" from above).
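
A rough sketch of what I mean, again for userspace.  foo_packed, a_demo
and read_i2() are hypothetical names, and I spell out the GCC attribute
instead of using the kernel's __packed macro:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packed variant of the quoted struct foo. */
struct foo_packed {
	uint32_t i1;
	uint64_t i2;
} __attribute__((packed));

/* Hypothetical container, in the spirit of "struct a" from above. */
struct a_demo {
	char c;
	struct foo_packed foo;
};

/* Packing removes the pad, but also drops the alignment to 1. */
static_assert(sizeof(struct foo_packed) == 12, "no pad between i1 and i2");
static_assert(_Alignof(struct foo_packed) == 1, "packed struct is byte aligned");

/* So the embedded struct starts right after the char. */
static_assert(offsetof(struct a_demo, foo) == 1, "foo starts at offset 1");

/*
 * The compiler has to assume p->i2 may sit at any byte address, so on
 * strict-alignment targets it cannot use a plain 8-byte load here.
 */
static uint64_t read_i2(const struct foo_packed *p)
{
	return p->i2;
}
```

On x86 the unaligned load is cheap, so the cost mostly shows up on other
architectures, but the compiler has to make the same assumption everywhere.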

> Now, looking at Yu-cheng's specific example, it doesn't matter.  We've
> got 64-bit types and natural 64-bit alignment.  Without __packed, we
> need to look out for natural alignment screwing us up.  With __packed,
> it just does what it *looks* like it does.

The question is whether Yu-cheng's struct is ever embedded in another
struct.  And if so, what does the hardware do?
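
To make the question concrete, here is a sketch with a made-up struct of
two 64-bit fields.  This is not Yu-cheng's actual struct; every name
below is hypothetical, and the offsets assume the x86-64 System V ABI:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical all-64-bit state, without and with packing. */
struct state_natural {
	uint64_t r0;
	uint64_t r1;
};

struct state_packed {
	uint64_t r0;
	uint64_t r1;
} __attribute__((packed));

/* On its own, the internal layout is identical either way. */
static_assert(sizeof(struct state_natural) == sizeof(struct state_packed), "same size");
static_assert(offsetof(struct state_natural, r1) == offsetof(struct state_packed, r1), "same offsets");

/* The difference only appears once it is embedded in something else. */
struct outer_natural {
	char c;
	struct state_natural s;
};

struct outer_packed {
	char c;
	struct state_packed s;
};

static_assert(offsetof(struct outer_natural, s) == 8, "padded up to 8");
static_assert(offsetof(struct outer_packed, s) == 1, "starts right after c");
```

Which of those two offsets is right depends on where the hardware expects
the state to live, which is exactly what I'm asking.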