On Thu, Feb 29, 2024, Vitaly Kuznetsov wrote: > Sean Christopherson <seanjc@xxxxxxxxxx> writes: > > Am I missing something, or can we just swap() the new and old, update the new > > in the context of the vCPU, and then undo the swap() if there's an issue? > > vcpu->mutex is held, and accessing this state from a different task is wildly > > unsafe, so I don't see any problem with temporarily having an in-flux state. > > > > I don't see why this approach shouldn't work and I agree it looks like > it would make things better but I can't say that I'm in love with > it. Agreed, but the lack of atomicity is a pre-existing problem, though as proposed, my idea would make it worse. More below. > Ideally, I would want to see the following "atomic" workflow for all > updates: > > - Check that the supplied data is correct, return an error if not. No > changes to the state on this step. > - Tweak the data if needed. > - Update the state and apply the side-effects of the update. Ideally, > there should be no errors on this step as rollback can be > problemmatic. In the real world we will have to handle e.g. failed > memory allocations here but in most cases the best course of action is > to kill the VM. > > Well, kvm_set_cpuid() is not like that. At least: > - kvm_hv_vcpu_init() is a side-effect but we apply it before all checks > are complete. There's no way back. > - kvm_check_cpuid() sounds like a pure checker but in reallity we end up > mangling guest FPU state in fpstate_realloc() Yeah, I really, really don't like the call to fpu_enable_guest_xfd_features(). But to not make it worse, that call could be hoisted out of kvm_check_cpuid() so that it can be performed after kvm_cpuid_check_equal(), i.e. be kept dead last (and with a comment saying it needs to be dead last due to side effects that are visible to serspace). > Both are probably "no big deal" but certainly break the atomicity. > > > If we want to be paranoid, we can probably get away with killing the VM if the > > vCPU has run and the incoming CPUID is "bad", e.g. to guard against something > > in kvm_set_cpuid() consuming soon-to-be-stale state. And that's actually a > > feature of sorts, because _if_ something in kvm_set_cpuid() consumes the vCPU's > > CPUID, then we have a bug _now_ that affects the happy path. > > > > Completely untested (I haven't updated the myriad helpers), but this would allow > > us to revert/remove all of the changes that allow peeking at a CPUID array that > > lives outside of the vCPU. > > Thanks, assuming there's no urgency Definitely no urgency. > let me take a look at this in the course of the next week or so. No need, it was more of an "FYI, this is what I may go futz with". Specifically, it will impact what I want to do with guest cpu_caps[*], hopefully in a good way. My plan is to play around with it when I get back to that series. [*] https://lore.kernel.org/all/20231110235528.1561679-1-seanjc@xxxxxxxxxx