On Fri, Feb 24, 2023, Chang S. Bae wrote:
> On 2/24/2023 3:56 PM, Mingwei Zhang wrote:
> > On Wed, Feb 22, 2023, Chang S. Bae wrote:
> > >
> > > 	/*
> > > -	 * The ptrace buffer is in non-compacted XSAVE format.  In
> > > -	 * non-compacted format disabled features still occupy state space,
> > > -	 * but there is no state to copy from in the compacted
> > > -	 * init_fpstate. The gap tracking will zero these states.
> > > +	 * Indicate which states to copy from fpstate. When not present in
> > > +	 * fpstate, those extended states are either initialized or
> > > +	 * disabled. They are also known to have an all zeros init state.
> > > +	 * Thus, remove them from 'mask' to zero those features in the user
> > > +	 * buffer instead of retrieving them from init_fpstate.
> > > 	 */
> > > -	mask = fpstate->user_xfeatures;
> >
> > Do we need to change this line and the comments? I don't see any of
> > these was relevant to this issue. The original code semantic is to
> > traverse all user_xfeatures, if it is available in fpstate, copy it from
> > there; otherwise, copy it from init_fpstate. We do not assume the
> > component in init_fpstate (but not in fpstate) are all zeros, do we? If
> > it is safe to assume that, then it might be ok. But at least in this
> > patch, I want to keep the original semantics as is without the
> > assumption.
>
> Here it has [1]:
>
> 	 *
> 	 * XSAVE could be used, but that would require to reshuffle the
> 	 * data when XSAVEC/S is available because XSAVEC/S uses xstate
> 	 * compaction. But doing so is a pointless exercise because most
> 	 * components have an all zeros init state except for the legacy
> 	 * ones (FP and SSE). Those can be saved with FXSAVE into the
> 	 * legacy area. Adding new features requires to ensure that init
> 	 * state is all zeroes or if not to add the necessary handling
> 	 * here.
> 	 */
> 	fxsave(&init_fpstate.regs.fxsave);

ah, I see.

>
> Thus, init_fpstate has zeros for those extended states.
> Then, copying from init_fpstate is the same as membuf_zero() by the gap
> tracking. But, we have two ways to do the same thing here.
>
> So I think it works that simply copying the state from fpstate only for
> those present there, then letting the gap tracking zero out for the rest of
> the userspace buffer for features that are either disabled or initialized.
>
> Then, we can remove accessing init_fpstate in the copy loop and which is the
> source of the problem. So I think this line change is relevant and also
> makes the code simple.
>
> I guess I'm fine if you don't want to do this. Then, let me follow up with
> something like this at first. Something like yours could be a fallback
> option for other good reasons, otherwise.

Hmm, I see. But this still holds only because of the software
implementation. What if a new hardware component requires a non-zero
init state? For instance, in the past we had the PKRU component, whose
init value is 0x555...54. Of course, that is a bad example because we
have since kicked it out of XSAVE/XRSTOR and handle it specially, but
there is no guarantee that we will never need a non-zero init state in
the future.

So, I will send out my fix and let you, Thomas, and potentially other
folks decide which option is best.

Overall, I get your point.

Thanks
-Mingwei

>
> Thanks,
> Chang
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/fpu/xstate.c#n386
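
For anyone following along, here is a rough userspace sketch of the "copy only what is present in fpstate, let the gap tracking zero the rest" idea discussed above. It is not the kernel code: struct toy_fpstate, toy_copy_to_user_buf() and the fixed 8-byte-per-feature layout are made up for illustration, and plain memset()/memcpy() stand in for the kernel's membuf helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy layout: 4 features, 8 bytes each, at fixed offsets
 * in a non-compacted user buffer. Real xstate offsets and sizes differ. */
#define TOY_NFEATURES    4
#define TOY_FEATURE_SIZE 8
#define TOY_BUF_SIZE     (TOY_NFEATURES * TOY_FEATURE_SIZE)

/* Illustrative stand-in for the kernel's fpstate, not the real struct. */
struct toy_fpstate {
	uint64_t xfeatures;       /* features with state present in 'data' */
	uint64_t user_xfeatures;  /* features exposed to userspace */
	uint8_t data[TOY_NFEATURES][TOY_FEATURE_SIZE];
};

/*
 * Copy only the features that are both user-visible and present in
 * 'fp'. Everything else in the buffer is zeroed by tracking the gap
 * between the end of the last copied feature and the next one.
 * This relies on the assumption that every skipped feature has an
 * all zeros init state.
 */
static void toy_copy_to_user_buf(const struct toy_fpstate *fp, uint8_t *ubuf)
{
	uint64_t mask;
	size_t pos = 0;  /* end of the last region written so far */
	int i;

	assert(fp != NULL && ubuf != NULL);

	mask = fp->user_xfeatures & fp->xfeatures;

	for (i = 0; i < TOY_NFEATURES; i++) {
		size_t off = (size_t)i * TOY_FEATURE_SIZE;

		if (!(mask & (1ULL << i)))
			continue;  /* left for the gap zeroing below */

		memset(ubuf + pos, 0, off - pos);  /* zero the gap */
		memcpy(ubuf + off, fp->data[i], TOY_FEATURE_SIZE);
		pos = off + TOY_FEATURE_SIZE;
	}

	/* Zero whatever remains after the last copied feature. */
	memset(ubuf + pos, 0, TOY_BUF_SIZE - pos);
}
```

Note that with this scheme a component that is enabled but not present in fpstate always shows up as zeros in the user buffer, so a component with a non-zero init state would need separate handling. That is the concern raised above.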