On Fri, Jul 28, 2023, Michal Luczaj wrote: > In a spirit of using a sledgehammer to crack a nut, make sync_regs() feed > __set_sregs() and kvm_vcpu_ioctl_x86_set_vcpu_events() with kernel's own > copy of data. > > Both __set_sregs() and kvm_vcpu_ioctl_x86_set_vcpu_events() assume they > have exclusive rights to structs they operate on. While this is true when > coming from an ioctl handler (caller makes a local copy of user's data), > sync_regs() breaks this contract; a pointer to a user-modifiable memory > (vcpu->run->s.regs) is provided. This can lead to a situation when incoming > data is checked and/or sanitized only to be re-set by a user thread running > in parallel. LOL, the really hilarious part is that the guilty, Fixes: 01643c51bfcf ("KVM: x86: KVM_CAP_SYNC_REGS") also added this comment... /* kvm_sync_regs struct included by kvm_run struct */ struct kvm_sync_regs { /* Members of this structure are potentially malicious. * Care must be taken by code reading, esp. interpreting, * data fields from them inside KVM to prevent TOCTOU and * double-fetch types of vulnerabilities. */ struct kvm_regs regs; struct kvm_sregs sregs; struct kvm_vcpu_events events; }; though Radim did remove something so maybe the comment isn't as ironic as it looks. [Removed wrapper around check for reserved kvm_valid_regs. - Radim] Signed-off-by: Radim Krčmář <rkrcmar@xxxxxxxxxx> Anyways... > Signed-off-by: Michal Luczaj <mhal@xxxxxxx> > --- > A note: when servicing kvm_run->kvm_dirty_regs, changes made by > __set_sregs()/kvm_vcpu_ioctl_x86_set_vcpu_events() to on-stack copies of > vcpu->run.s.regs will not be reflected back in vcpu->run.s.regs. Is this > ok? I would be amazed if anyone cares. Given the justification and the author, This reduces ioctl overhead which is particularly important when userspace is making synchronous guest state modifications (e.g. when emulating and/or intercepting instructions). Signed-off-by: Ken Hofsass <hofsass@xxxxxxxxxx> I am pretty sure this was added to optimize a now-abandoned Google effort to do emulation in uesrspace. I bring that up because I was going to suggest that we might be able to get away with a straight revert, as QEMU doesn't use the flag and AFAICT neither does our VMM, but there are a non-zero number of hits in e.g. github, so sadly I think we're stuck with the feature :-(