> > +.SS The layout and operation of the NT_X86_XSTATE regset
>
> Should rather have a complete table of NT_* entries first. The others
> can be dummies for now.

Oh boy, I'm not sure my man-page-formatting-fu is up to the task of
creating a nice looking table :). Michael, can you help?

> > +Obtain the kernel xsave component bitmask from the software-reserved area of the
> > +xstate buffer. The software-reserved area begins at offset 464 into the xsave
>
> It would be better to put some struct defining this into the kernel uapi
> and then reference that instead of magic numbers.

We have user_xstateregs in the kernel, but that's not in the uapi. I
suppose we should move it, given that it is exposed here.

> > +buffer and the first 64 bits of this area contain the kernel xsave component bitmask
> > +.IP 2.
> > +Compute the offset of each state component by adding the sizes of all prior state
> > +components that are enabled in the kernel xsave component bitmask, aligning to 64 byte boundaries along the way. This
> > +format matches that of a compacted xsave area with XCOMP_BV set to the
>
> The sizes of these areas should probably also be in the uapi include

Yes, that seems like a good idea. What's the policy on helper functions
in uapi includes? Can we have helper functions that, given a buffer and
the kernel xstate mask, do this computation for you?

> > +kernel component bitmask. Further details on the layout of the compacted xsave
> > +area may be found in the Intel architecture manual, but note that the xsave
> > +buffer returned from ptrace will have its XCOMP_BV set to 0.
>
> The note seems disconnected. You'll have to explain it here.

Ok, I'll elaborate. The point I wanted to make is that even though the
buffer looks compressed, XCOMP_BV is 0, so it's not a valid compressed
buffer that can be xrstor'ed.

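To make the computation quoted above concrete, here is a rough sketch of
what such a helper might look like. Everything named here
(xstate_kernel_mask, xstate_component_offset, the XSAVE_* macros) is
hypothetical and not an existing kernel or uapi symbol. The sketch
assumes the caller supplies the per-component sizes and a mask of
components that need 64-byte alignment (for example from CPUID leaf
0xD), and that the extended area starts at byte 576, i.e. after the
512-byte legacy area and the 64-byte xsave header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; the offsets are the ones discussed in this thread. */
#define XSAVE_SW_RESERVED_OFFSET 464u /* start of the software-reserved area */
#define XSAVE_EXTENDED_OFFSET    576u /* 512-byte legacy area + 64-byte header */
#define XSAVE_NR_COMPONENTS      64u

/* Read the kernel xsave component bitmask out of a regset buffer. */
static uint64_t xstate_kernel_mask(const void *buf)
{
	uint64_t mask;

	assert(buf != NULL);
	/* The first 64 bits of the software-reserved area hold the mask. */
	memcpy(&mask, (const unsigned char *)buf + XSAVE_SW_RESERVED_OFFSET,
	       sizeof(mask));
	return mask;
}

/*
 * Return the byte offset of extended state component `comp` (2..63) in
 * a buffer laid out as described above, or -1 if the component is not
 * enabled in `mask`.
 *
 * sizes[i] is the size of component i in bytes; bit i of align_mask is
 * set if component i must start on a 64-byte boundary. Both are
 * assumed to be provided by the caller.
 */
static long xstate_component_offset(uint64_t mask, unsigned int comp,
				    const uint32_t sizes[XSAVE_NR_COMPONENTS],
				    uint64_t align_mask)
{
	unsigned long offset = XSAVE_EXTENDED_OFFSET;
	unsigned int i;

	if (comp < 2 || comp >= XSAVE_NR_COMPONENTS ||
	    !(mask & (UINT64_C(1) << comp)))
		return -1;

	for (i = 2; i <= comp; i++) {
		/* Components that are not enabled take up no space. */
		if (!(mask & (UINT64_C(1) << i)))
			continue;
		if (align_mask & (UINT64_C(1) << i))
			offset = (offset + 63) & ~63UL;
		if (i == comp)
			break;
		offset += sizes[i];
	}
	return (long)offset;
}
```

A ptracer would call xstate_kernel_mask() on the buffer it got back from
PTRACE_GETREGSET and then pass that mask to xstate_component_offset()
for each component it cares about. Components 0 and 1 (x87 and SSE) live
in the legacy area and are deliberately not handled here.
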
> > +Thus, to obtain an xsave area that may be set back to the tracee, all unused
> > +state components must first be re-set to the correct initial state for the
> > +corresponding state component, and the XSTATE_BV bitfield must subsequently
> > +be adjusted to match the kernel xstate component bitmask (obtained as
> > +described above).
>
> I wonder if we shouldn't just fix the kernel to do this properly on its
> own. Presumably it won't break any existing user space.
>
> It seems more a bug than something that should be a documented ABI.

I'd be happy to see this interface improved, since I do think it wasn't
quite intended to work this way when originally conceived (i.e.
originally, before the init optimization and before we had flags that
turn off various xstate components, resulting in a compressed buffer).
As I said in the other email thread, I think it would be reasonable to
change the offsets to always be non-compressed, which would at least
make this a normal xsave buffer. No ptracer that I looked at knows that
this buffer can be compressed, so changing the kernel behavior here
would actually make it closer to what existing userspace expects ;).

The getregset/setregset mismatch is harder. It's pretty bad, but I don't
see a way to fix it short of introducing a different NT_X86_* constant
that behaves differently.

> > +
> > +The value of the kernel's state component bitmask is determined on boot and
> > +need not be equivalent to the maximal set of state components supported by the
> > +CPU (as enumerated through CPUID).
>
> Okay so how should someone get it? Looks like that's a hole in the
> kernel API that we need to fix somehow.

The CPUID-enumerated value does still represent a maximum, so it can be
used to size the buffer, and the actual value can then be read from the
software-reserved area as described here. Is that what you were asking?
Not sure I understood correctly.

Keno