Re: [PATCH v2] KVM: X86: Fix exception untrigger on ret to user

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2021-07-06 at 15:06 +0300, stsp wrote:
> 06.07.2021 14:49, Maxim Levitsky пишет:
> > Now about the KVM's userspace API where this is exposed:
> >   
> > I see now too that KVM_SET_REGS clears the pending exception.
> > This is new to me and it is IMHO *wrong* thing to do.
> > However I bet that someone somewhere depends on this,
> > since this behavior is very old.
> 
> What alternative would you suggest?
> Check for ready_for_interrupt_injection
> and never call KVM_SET_REGS if it indicates
> "not ready"?
> But what if someone calls it nevertheless?
> Perhaps return an error from KVM_SET_REGS
> if exception is pending? Also KVM_SET_SREGS
> needs some treatment here too, as it can
> also be called when an exception is pending,
> leading to problems.

As I explained you can call KVM_GET_VCPU_EVENTS before calling
KVM_SET_REGS and then call KVM_SET_VCPU_EVENTS with the struct
that was filled by KVM_GET_VCPU_EVENTS.
That will preserve all the cpu events.

> 
> 
> > This doesn't affect qemu since when it does KVM_SET_REGS,
> > it also does KVM_SET_VCPU_EVENTS afterward, and that
> > restores either pending or injected exception state.
> > (that state is first read with KVM_GET_VCPU_EVENTS ioctl).
> >   
> > Also note just for reference that KVM_SET_SREGS has ability
> > to set a pending interrupt, something that is also redundant
> > since KVM_SET_VCPU_EVENTS does this.
> > I recently added KVM_SET_SREGS2 ioctl which now lacks this
> > ability.
> > 
> > KVM_SET_VCPU_EVENTS/KVM_GET_VCPU_EVENTS allows to read/write
> > state of both pending and injected exception and on top of that
> > allows to read/write the exception payload of a pending exception.
> > 
> > You should consider using it to preserve/modify the exception state,
> > although the later isn't recommended (qemu does this in few places,
> > but it is a bit buggy, especially in regard to nested guests).
> I need neither to preserve nor modify
> the exception state. All I need is to safely
> call KVM_SET_REGS/KVM_SET_SREGS, but
> that appears unsafe when exception is
> pending.
> 
> Please take a look into this commit:
> https://www.lkml.org/lkml/2020/12/1/324
> 
> It was suggested that the removal
> of "!kvm_event_needs_reinjection(vcpu)"
> condition from kvm_vcpu_ready_for_interrupt_injection()
> is what introduced the problem, as
> right now ready_for_interrupt_injection
> doesn't account for pending exception.
> I already added the check to my
> user-space code, and now it works
> reliably on some very old kernels
> prior to the aforementioned patch.
> So should we treat that as a regressions,
> or any other proposal?

I haven't studied the userspace interrupt injection code much but
it does sound like if we signal to userspace that irq window is
open, that means that indeed there must be no injected interrupts/exceptions.

Now let me explain how nesting of events supposed to work on real hardware:

Event nesting happens when during delivery of an event (interrupt or exception)
we got another event (it has to be exception practically, and it happens
only when the delivery process (like switching stacks, reading IDT,
pushing stuff to the stack, etc causes an exception)

There are 4 combinations of an event, and event that happened  during delivery
of it.


1. Exception->Exception:
There is an  exception, and during delivery of it we got an exception
(like #PF on accessing GDT, or #NP, #SS, or something like that)
In this case exceptions are merged which means that they are either converted to 
#DF,  first exception is lost, or we get a triple fault (SDM has a table
for all cases).


2. Interrupt->Exception
If there is an injected interrupt and during delivery of it, we get an exception
this means that jump to the interrupt handler caused an exception,
in which case the interrupt is lost and we run the exception handler instead.


3. Interrupt->Interrupt
4. Exception->Interrupt.

Those two cases can't happen, as interrupts should not be processed while delivering
an exception/interrupt.

Therefore indeed if we signal the userspace that interrupt window is open,
that means that there must be no injected interrupt/exception.

If the userspace however wants to inject an exception, currently it
can only correctly do so if it uses KVM_GET_VCPU_EVENTS, to see if we
already have a injected exception there and then merge itself new
exception. And that still doesn't work correctly when nested guests
are involved. As I said the userspace exception injection via KVM_GET_VCPU_EVENTS
is kind of broken, and only works in qemu since it uses it very rarely.

Paolo, what do you think?


Best regards,
	Maxim Levitsky




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux