Re: [PATCH v2] KVM: X86: Fix exception untrigger on ret to user

Jim Mattson <jmattson@xxxxxxxxxx> · Wed, 7 Jul 2021 10:39:56 -0700

On Wed, Jul 7, 2021 at 9:58 AM stsp <stsp2@xxxxxxxxx> wrote:
>
> 07.07.2021 19:46, Jim Mattson wrote:
> > On Wed, Jul 7, 2021 at 9:34 AM stsp <stsp2@xxxxxxxxx> wrote:
> >> 07.07.2021 19:16, Jim Mattson wrote:
> >>> On Tue, Jul 6, 2021 at 4:06 PM stsp <stsp2@xxxxxxxxx> wrote:
> >>>> 07.07.2021 02:00, Maxim Levitsky wrote:
> >>>>> On Wed, 2021-07-07 at 00:50 +0300, stsp wrote:
> >>>>>> 06.07.2021 23:29, Maxim Levitsky wrote:
> >>>>>>> On Tue, 2021-07-06 at 15:06 +0300, stsp wrote:
> >>>>>>>> 06.07.2021 14:49, Maxim Levitsky wrote:
> >>>>>>>>> Now about the KVM's userspace API where this is exposed:
> >>>>>>>>>
> >>>>>>>>> I see now too that KVM_SET_REGS clears the pending exception.
> >>>>>>>>> This is new to me and it is IMHO the *wrong* thing to do.
> >>>>>>>>> However I bet that someone somewhere depends on this,
> >>>>>>>>> since this behavior is very old.
> >>>>>>>> What alternative would you suggest?
> >>>>>>>> Check for ready_for_interrupt_injection
> >>>>>>>> and never call KVM_SET_REGS if it indicates
> >>>>>>>> "not ready"?
> >>>>>>>> But what if someone calls it nevertheless?
> >>>>>>>> Perhaps return an error from KVM_SET_REGS
> >>>>>>>> if an exception is pending? Also KVM_SET_SREGS
> >>>>>>>> needs some treatment here too, as it can
> >>>>>>>> also be called when an exception is pending,
> >>>>>>>> leading to problems.
> >>>>>>> As I explained you can call KVM_GET_VCPU_EVENTS before calling
> >>>>>>> KVM_SET_REGS and then call KVM_SET_VCPU_EVENTS with the struct
> >>>>>>> that was filled by KVM_GET_VCPU_EVENTS.
> >>>>>>> That will preserve all the cpu events.
> >>>>>> The question is different.
> >>>>>> I wonder how _should_ the KVM
> >>>>>> API behave when someone calls
> >>>>>> KVM_SET_REGS/KVM_SET_SREGS
> >>>>> KVM_SET_REGS should not clear the pending exception,
> >>>>> but fixing this can break API compatibility if some
> >>>>> hypervisor (not qemu) relies on it.
> >>>>>
> >>>>> Thus either a new ioctl is needed or as I said,
> >>>>> KVM_GET_VCPU_EVENTS/KVM_SET_VCPU_EVENTS can be used
> >>>>> to preserve the events around that call as a workaround.
> >>>> But I don't need to preserve
> >>>> events. Canceling is perfectly
> >>>> fine with me because, if I inject
> >>>> the interrupt at that point, the
> >>>> exception will be re-triggered
> >>>> anyway after interrupt handler
> >>>> returns.
> >>> The exception will not be re-triggered if it was a trap,
> >> But my assumption was that
> >> everything is atomic, except
> >> PF with shadow page tables.
> >> I guess you mean the cases
> >> when the exception delivery
> >> causes EPT fault, which is a bit
> >> of a corner case.
> > No, that's not what I mean. Consider the #DB exception, which is
> > intercepted in all configurations to circumvent a DoS attack. Some #DB
> > exceptions modify DR6. Once the exception has been 'injected,' DR6 has
> > already been modified. If you do not complete the injection, but you
> > deliver an interrupt instead, then the interrupt handler can see a DR6
> > value that is architecturally impossible.
>
> Yes, I understand that part.
> It seems to be called the "exception
> payload" in kvm sources, and
> includes also CR2 for #PF.
> So of course if there are many
> non-atomic cases, rather than
> just one, then there is no doubt
> we need to check ready_for_injection.
> It's just that I was looking at that
> non-atomicity as a kvm quirk,
> but it's probably a fundamental
> part of vmx instead.

You could argue that it's a Linux quirk. Without EINTR, this probably
wouldn't be an issue.
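
For reference, the ready_for_interrupt_injection check discussed above might look roughly like the sketch below. It assumes a userspace irqchip (KVM_INTERRUPT is not usable with the in-kernel one) and assumes that ready_for_interrupt_injection is cleared while an exception is still pending, which is the behavior this thread is about. The helper name inject_irq_if_ready() is made up for illustration.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/*
 * Hypothetical helper: inject an interrupt only when KVM reported, on the
 * last exit from KVM_RUN, that it is ready to accept one.
 *
 * Returns 1 if the interrupt was queued, 0 if injection was deferred,
 * and -1 if the KVM_INTERRUPT ioctl failed.
 */
int inject_irq_if_ready(int vcpu_fd, struct kvm_run *run, unsigned int irq)
{
    struct kvm_interrupt intr = { .irq = irq };

    if (!run->ready_for_interrupt_injection) {
        /* Ask KVM_RUN to exit as soon as an interrupt can be injected. */
        run->request_interrupt_window = 1;
        return 0;
    }

    run->request_interrupt_window = 0;
    if (ioctl(vcpu_fd, KVM_INTERRUPT, &intr) < 0)
        return -1;
    return 1;
}
```

When this returns 0, the caller simply re-enters KVM_RUN and retries after the next exit, without touching the registers in between.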

> There is still the problem that
> KVM_SET_REGS cancels the
> injection and KVM_SET_SREGS does not.
> But I realize you may want to leave
> it that way for compatibility.

Sadly, broken userspace APIs often have to remain broken. The "fix" is
probably to introduce KVM_SET_REGS2.