Re: [PATCH] KVM: X86: Fix exception untrigger on ret to user

Vitaly Kuznetsov <vkuznets@xxxxxxxxxx> · Mon, 28 Jun 2021 12:07:57 +0200

stsp <stsp2@xxxxxxxxx> writes:

> On 28.06.2021 10:20, Vitaly Kuznetsov wrote:
>> Stas Sergeev <stsp2@xxxxxxxxx> writes:
>>
>>> When returning to user, the special care is taken about the
>>> exception that was already injected to VMCS but not yet to guest.
>>> cancel_injection removes such exception from VMCS. It is set as
>>> pending, and if the user does KVM_SET_REGS, it gets completely canceled.
>>>
>>> This didn't happen though, because vcpu->arch.exception.injected
>>> and vcpu->arch.exception.pending were not updated in
>>> cancel_injection. As a result, KVM_SET_REGS didn't cancel out
>>> anything, and the exception was re-injected on the next KVM_RUN,
>>> even though the guest registers (like EIP) were already modified.
>>> This was leading to an exception coming from the "wrong place".
>> It shouldn't be that hard to reproduce this in selftests, I
>> believe.
>
> Unfortunately the problem happens only on core2 CPU. I believe the reason
> is perhaps that more modern CPUs do not go to software for the exception
> injection?

Hm, I completely missed that in the original description. As I read
it, the 'cancel_injection' path in vcpu_enter_guest() is always broken
when vcpu->arch.exception.injected is set, as we forget to clear it...
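
To make the state transition concrete, here is a toy user-space model of
the bookkeeping as I understand it from the patch description. It is not
kernel code: every toy_* name and the vmcs_has_event field are
hypothetical, and the "fixed" variant only mirrors what the description
says should happen (the exception goes back to pending so that a later
register write can drop it).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the flags kept in vcpu->arch.exception. */
struct toy_exception {
    bool pending;   /* queued, not yet written to the VMCS */
    bool injected;  /* already written to the VMCS */
};

struct toy_vcpu {
    struct toy_exception exception;
    bool vmcs_has_event;  /* hypothetical: event programmed for VM entry */
};

/* Cancel path as described for the bug: only the VMCS side is undone. */
static void toy_cancel_injection_buggy(struct toy_vcpu *v)
{
    v->vmcs_has_event = false;
}

/* Cancel path with the bookkeeping updated: injected becomes pending. */
static void toy_cancel_injection_fixed(struct toy_vcpu *v)
{
    v->vmcs_has_event = false;
    if (v->exception.injected) {
        v->exception.injected = false;
        v->exception.pending = true;
    }
}

/* Register write from userspace: drops a pending exception only. */
static void toy_set_regs(struct toy_vcpu *v)
{
    v->exception.pending = false;
}

/* True if the next run would still deliver the old exception. */
static bool toy_would_reinject(const struct toy_vcpu *v)
{
    return v->exception.pending || v->exception.injected;
}
```

In this model, cancelling with the buggy variant and then calling
toy_set_regs() leaves 'injected' set, so the stale exception would still
be delivered; with the fixed variant nothing is left to re-inject.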

>
>
>>   'exception.injected' can even be set through
>> KVM_SET_VCPU_EVENTS and then we call KVM_SET_REGS.
>
> Does this mean I shouldn't add WARN_ON_ONCE()?

WARN_ON_ONCE() is fine IMO, provided there's no valid case where
'vcpu->arch.exception.injected' is set during __set_regs(). A selftest
is needed to check for '... this was leading to an exception coming from
the "wrong place"'.

>
>
>>   Alternatively, we can
>> trigger a real exception from the guest. Could you maybe add something
>> like this to tools/testing/selftests/kvm/x86_64/set_sregs_test.c?
> Even if you have the right CPU to reproduce that (Core2), you also
> need the _TIF_SIGPENDING at the right moment to provoke the cancel_injection
> path. This is like triggering a race. If you don't get _TIF_SIGPENDING
> then it will just re-enter the guest and inject the exception properly.

I'd like to understand the hardware dependency first. Is it possible
that the exception which causes the problem is not triggered on other
CPUs? We can find a different way to trigger an exception from selftest
then.

(Maybe it's just me who still struggles to see the full picture here,
hope Sean/Paolo will see the problem you're trying to address in no
time)

-- 
Vitaly