28.06.2021 13:56, Vitaly Kuznetsov wrote:
stsp <stsp2@xxxxxxxxx> writes:
Yes, cancel_injection does indeed
appear to have always been broken.
But there is more to it.
Namely:
- Other CPUs do not seem to exhibit
that path. My guess here is that they
just handle the exception in hardware,
without returning to KVM for that. I
am not sure why Core2 vmexits on
every page fault. Is it incapable of
handling the PF in hardware, or maybe
some other bug is around?
Wild guess: no EPT support and running on shadow pages?
That's something you should tell
me, and not the other way around. :)
I am just working with kvm as a user.
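For anyone wanting to check the EPT guess on their own host, here is a minimal sketch. It assumes an Intel host with the kvm_intel module loaded, which exposes its `ept` parameter under /sys/module/kvm_intel/parameters/ept as "Y" or "N". The helper name read_bool_param is made up for this example.

```c
#include <stdio.h>

/*
 * Read a boolean module parameter exposed through sysfs.
 * Returns 1 for "Y"/"1", 0 for "N"/"0", and -1 if the file cannot be
 * opened or starts with anything else.
 * read_bool_param is a hypothetical helper, not part of any KVM API.
 */
static int read_bool_param(const char *path)
{
    FILE *f = fopen(path, "r");
    int c;

    if (!f)
        return -1;
    c = fgetc(f);
    fclose(f);

    if (c == 'Y' || c == 'y' || c == '1')
        return 1;
    if (c == 'N' || c == 'n' || c == '0')
        return 0;
    return -1;
}
```

Calling read_bool_param("/sys/module/kvm_intel/parameters/ept") and getting 0 would suggest the guest runs on shadow page tables; -1 most likely means the module is not loaded or the host is not Intel.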
- Even if you followed the broken
path, in most cases everything is still
fine: the exception will just be re-injected.
The unfortunate scenario is when you
have _TIF_SIGPENDING set at exactly
the right place. Then you go to user-space,
and user-space is unlucky enough to call
SET_REGS right there. These conditions
are not very likely to happen. I wrote a
test-case for it, but it involves the entire
buildroot setup and you need to wait
a bit while it is trying to trigger the race.
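The sequence described above can be sketched as a toy state machine. This is not kernel code: every name here (toy_vcpu, toy_enter_guest, toy_set_regs, toy_run) is hypothetical, and the model assumes that SET_REGS discards an exception that cancel_injection has put back into the pending state, which is my reading of the scenario rather than a confirmed description of KVM internals.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; all names are hypothetical, none come from KVM. */
struct toy_vcpu {
    bool exc_pending;  /* exception queued, not yet written for entry */
    bool exc_in_vmcs;  /* exception written for the next guest entry */
    bool sig_pending;  /* stands in for _TIF_SIGPENDING */
    int delivered;     /* exceptions the guest actually received */
};

/* One attempt to enter the guest. Returns false on exit to user space. */
static bool toy_enter_guest(struct toy_vcpu *v)
{
    assert(!(v->exc_pending && v->exc_in_vmcs));

    if (v->exc_pending) {           /* inject the queued exception */
        v->exc_pending = false;
        v->exc_in_vmcs = true;
    }
    if (v->sig_pending) {           /* exit requested: cancel injection */
        if (v->exc_in_vmcs) {
            v->exc_in_vmcs = false;
            v->exc_pending = true;  /* to be re-injected next time */
        }
        v->sig_pending = false;
        return false;
    }
    if (v->exc_in_vmcs) {           /* guest entered, exception delivered */
        v->exc_in_vmcs = false;
        v->delivered++;
    }
    return true;
}

/* Assumption: setting registers drops any pending exception. */
static void toy_set_regs(struct toy_vcpu *v)
{
    v->exc_pending = false;
}

/* Queue one exception and return how many times it reached the guest. */
static int toy_run(bool signal, bool set_regs)
{
    struct toy_vcpu v = { .exc_pending = true, .sig_pending = signal };

    if (!toy_enter_guest(&v)) {
        if (set_regs)
            toy_set_regs(&v);
        toy_enter_guest(&v);
    }
    return v.delivered;
}
```

In this model toy_run(false, false) and toy_run(true, false) both return 1, because the exception is either delivered directly or re-injected, while toy_run(true, true) returns 0: the signal forces the exit, SET_REGS lands in the window, and the exception is lost.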
Maybe there's an easier way to trigger imminent exit to userspace which
doesn't involve
Any API to intercept all guest exceptions?
But even if there is, I am afraid in that
case cancel_injection is not going to be
executed. It is executed only when
kvm_vcpu_exit_request(vcpu) returns true.
... which makes such scenario valid?
We should not add userspace-triggerable WARNs in kernel, right. I was
not sure if the WARN you add stays triggerable post-patch.
I thought it's not - at least not when
the exceptions are coming from the
guest. Maybe WARN_ON() can somehow
check if the exception was injected by
user-space, like by checking the events
bitmask?
Or I'll just remove it.
Maybe the huge number of injected #PFs (which are triggered because
there's no EPT) contributes to the ease of reproduction? Purely
from looking at the code of your patch, the issue should also
happen with other exceptions, KVM just doesn't inject them that
often.
I haven't seen the same race with
any other exception, like with GP.
I suspect there is no vmexit
for GP, so it will just be injected in
hardware. I think only PF causes
the problem because of the shadow
page tables, as you pointed out before.
While I haven't made a specific
test-case to try GP, I am quite sure
it would have been observed long
ago because GPs in our usage
scenarios are much more frequent
than PFs. But the race was never
observed with GP.