Re: [PATCH v2] KVM: X86: Fix exception untrigger on ret to user

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2021-06-28 at 17:29 +0300, Maxim Levitsky wrote:
> On Mon, 2021-06-28 at 15:48 +0300, Stas Sergeev wrote:
> > When returning to user, the special care is taken about the
> > exception that was already injected to VMCS but not yet to guest.
> > cancel_injection removes such exception from VMCS. It is set as
> > pending, and if the user does KVM_SET_REGS, it gets completely canceled.
> > 
> > This didn't happen though, because the vcpu->arch.exception.injected
> > and vcpu->arch.exception.pending were forgotten to update in
> > cancel_injection. As the result, KVM_SET_REGS didn't cancel out
> > anything, and the exception was re-injected on the next KVM_RUN,
> > even though the guest registers (like EIP) were already modified.
> > This was leading to an exception coming from the "wrong place".
> > 
> > This patch makes sure the vcpu->arch.exception.injected and
> > vcpu->arch.exception.pending are in sync with the reality (and
> > with VMCS). Also it adds clearing of pending exception to
> > __set_sregs() the same way it is in __set_regs(). See patch
> > b4f14abd9 that added it to __set_regs().
> > 
> > How to trigger the buggy scenario (that is, without this patch):
> > - Make sure you have the old CPU where shadow page tables
> > are used. Core2 family should be fine for the task. In this
> > case, all PF exceptions produce the exit to monitor.
> > - You need the _TIF_SIGPENDING flag set at the right moment
> > to get kvm_vcpu_exit_request() to return true when the PF
> > exception was just injected. In that case the cancel_injection
> > path is executed.
> > - You need the "unlucky" user-space that executes KVM_SET_REGS
> > at the right moment. This leads to KVM_SET_REGS not clearing
> > the exception, but instead corrupting its context.
> > 
> > v2 changes:
> > - do not add WARN_ON_ONCE() to __set_regs(). As explained by
> > Vitaly Kuznetsov, it can be user-triggerable.
> > - clear pending exception also in __set_sregs().
> > - update description with the bug-triggering scenario.
> 
> I used to know that area very very well, and I basically combed
> the whole thing back and forth, 
> and I have patch series to decouple injected and
> pending exceptions. 
> 
> I'll refresh my memory on this and then I'll review your patch.

Hi!
Sorry for the delayed response.
 
First of all indeed an exception can be either in pending or injected state.
A pending exception is an exception KVM created but that didn't yet change
the guest state. 
 
It still has to be injected to the guest on its current instruction,
thus its not like a pending interrupt.
 
An injected exception is an exception which was already injected to the guest
(or at least attempted an injection but that was aborted)

Since the exception payload is injected prior to pushing the error
code on the stack and then jumping to the exception handler,
the guest state is already modified.

There can be 2 reasons why exception delivery is aborted
like that:
 
1. By KVM (that is cancel_injection), when KVM needs to exit
to userspace due to a signal
 
2. By the CPU, when CPU detects another exception or paging fault,
while delivering this exception.
 
In both cases the exception stays in the 'injected' state, and
has to be injected again (that is why I don't really like the
'injected' term, since it is more like 'injection_started').

Now about the KVM's userspace API where this is exposed:
 
I see now too that KVM_SET_REGS clears the pending exception.
This is new to me and it is IMHO *wrong* thing to do.
However I bet that someone somewhere depends on this,
since this behavior is very old.
 
This doesn't affect qemu since when it does KVM_SET_REGS,
it also does KVM_SET_VCPU_EVENTS afterward, and that 
restores either pending or injected exception state.
(that state is first read with KVM_GET_VCPU_EVENTS ioctl).
 
Also note just for reference that KVM_SET_SREGS has ability
to set a pending interrupt, something that is also redundant
since KVM_SET_VCPU_EVENTS does this.
I recently added KVM_SET_SREGS2 ioctl which now lacks this
ability.

KVM_SET_VCPU_EVENTS/KVM_GET_VCPU_EVENTS allows to read/write
state of both pending and injected exception and on top of that
allows to read/write the exception payload of a pending exception.

You should consider using it to preserve/modify the exception state,
although the later isn't recommended (qemu does this in few places,
but it is a bit buggy, especially in regard to nested guests).

Best regards,
	Maxim Levitsky




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux