Re: guest/host mem out of sync on core2duo?

Dropped my old @intel email to stop getting bounces.

On Mon, Jun 14, 2021, stsp wrote:
> 14.06.2021 20:06, Sean Christopherson wrote:
> > On Sun, Jun 13, 2021, stsp wrote:
> > > Hi kvm developers.
> > > 
> > > I am having the strange problem that can only be reproduced on a core2duo CPU
> > > but not AMD FX or Intel Core I7.
> > > 
> > > My code has 2 ways of setting the guest registers: one is the guest's ring0
> > > stub that just pops all regs from stack and does iret to ring3.  That works
> > > fine.  But sometimes I use KVM_SET_SREGS and resume the VM directly to ring3.
> > > That randomly results in either a good run or invalid guest state return, or
> > > a page fault in guest.
> > Hmm, a core2duo failure is more than likely due to lack of unrestricted guest.
> > You can verify this by loading kvm_intel on the Core i7 with unrestricted_guest=0.
> 
> Wow, excellent shot!  Indeed, the problem then starts reproducing also there!
> So at least I now have a problematic setup myself, rather than needing to ask
> for ssh from everyone involved. :)
> 
> What does this mean to us, though?  That it's completely unrelated to any
> memory synchronization?

Yes, more than likely this has nothing to do with memory synchronization.

> > > I tried to analyze when either of the above happens exactly, and I have a
> > > very strong suspicion that the problem is in the way I update the LDT. LDT is
> > > shared between guest and host with KVM_SET_USER_MEMORY_REGION, and I modify
> > > it on host.  So it seems like if I just allocated the new LDT entry, there is
> > > a risk of invalid guest state, as if the guest's LDT still doesn't have it.
> > > If I modified some LDT entry, there can be a page fault in guest, as if the
> > > entry is still old.
> > IIUC, you are updating the LDT itself, e.g. an FS/GS descriptor in the LDT, as
> > opposed to updating the LDT descriptor in the GDT?
> 
> I am updating the LDT itself, not modifying its descriptor in gdt. And with
> the same KVM_SET_SREGS call I also update the segregs to the new values, if
> needed.

Hmm, unconditionally calling KVM_SET_SREGS if you modify anything in the LDT
would be worth trying.  Or did I misunderstand the "if needed" part?
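
For reference, KVM_SET_SREGS takes the cached descriptor fields (base, limit,
access rights) from userspace; it does not re-read them from the LDT in guest
memory.  So after changing an LDT entry, the matching kvm_segment would need to
be rebuilt from the new descriptor.  A minimal sketch of that decode step,
assuming legacy 8-byte descriptors; "struct seg" and seg_from_descriptor() are
hypothetical names, and the struct only mirrors the fields of struct
kvm_segment from <linux/kvm.h> so the snippet builds without kernel headers:

```c
#include <stdint.h>

/* Hypothetical stand-in mirroring the fields of struct kvm_segment. */
struct seg {
    uint64_t base;
    uint32_t limit;
    uint16_t selector;
    uint8_t type, present, dpl, db, s, l, g, avl, unusable;
};

/*
 * Decode one legacy 8-byte GDT/LDT descriptor into cached-segment form.
 * The limit is expanded to byte granularity when G=1.  "unusable" is left
 * at 0; marking null selectors unusable is up to the caller.
 */
static struct seg seg_from_descriptor(const uint8_t d[8], uint16_t selector)
{
    struct seg s = {0};
    uint32_t raw_limit = (uint32_t)d[0] | ((uint32_t)d[1] << 8) |
                         ((uint32_t)(d[6] & 0x0f) << 16);

    s.selector = selector;
    s.base = (uint64_t)d[2] | ((uint64_t)d[3] << 8) |
             ((uint64_t)d[4] << 16) | ((uint64_t)d[7] << 24);

    s.type    = d[5] & 0x0f;         /* access byte, bits 3:0 */
    s.s       = (d[5] >> 4) & 1;     /* 1 = code/data, 0 = system */
    s.dpl     = (d[5] >> 5) & 3;
    s.present = (d[5] >> 7) & 1;

    s.avl = (d[6] >> 4) & 1;         /* flags nibble */
    s.l   = (d[6] >> 5) & 1;
    s.db  = (d[6] >> 6) & 1;
    s.g   = (d[6] >> 7) & 1;

    s.limit = s.g ? (raw_limit << 12) | 0xfff : raw_limit;
    return s;
}
```

The result would then be copied field by field into the relevant member of
struct kvm_sregs (e.g. sregs.fs) before the KVM_SET_SREGS call.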

> > Either way, do you also update all relevant segments via KVM_SET_SREGS after
> > modifying memory?
> 
> Yes, if this is needed.  Sometimes it's not needed, and when not - it seems
> page fault is more likely. If I also update segregs - then invalid guest
> state.  But these are just the statistical guesses so far.

Ah.  Hrm.  It would still be worth doing KVM_SET_SREGS unconditionally, e.g. it
would narrow the search if the page faults go away and the failures are always
invalid guest state.

> >     Best guess is that KVM doesn't detect that the VM has state
> > that needs to be emulated, or that KVM's internal register state and what's in
> > memory are not consistent.
> 
> Hope you know what parts are emulated w/o unrestricted guest, in which case
> we can advance. :)

It's not parts per se.  KVM needs to emulate "everything", one instruction at a
time, until guest state is no longer invalid with respect to the !unrestricted
rules.
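
To make "the !unrestricted rules" a bit more concrete, here is a rough
userspace sketch of a subset of the segment checks the Intel SDM lists for
protected-mode guests when unrestricted guest is disabled.  It is not KVM's
code; it ignores virtual-8086 mode, TR/LDTR, and the limit/granularity rules,
and "struct seg", cs_ok(), ss_ok() and data_ok() are hypothetical names (the
struct mirrors a few fields of struct kvm_segment):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct kvm_segment fields used below. */
struct seg {
    uint16_t selector;
    uint8_t type, present, dpl, s, unusable;
};

/* CS: accessed code segment (type 9, 11, 13 or 15), RPL equal to SS.RPL. */
static bool cs_ok(const struct seg *cs, const struct seg *ss)
{
    if (cs->unusable || !cs->s || !cs->present)
        return false;
    if ((cs->type & 0x9) != 0x9)
        return false;
    if ((cs->selector & 3) != (ss->selector & 3))
        return false;
    if (cs->type & 0x4)                 /* conforming: DPL <= SS.DPL */
        return cs->dpl <= ss->dpl;
    return cs->dpl == ss->dpl;          /* non-conforming: DPL == SS.DPL */
}

/* SS: if usable, accessed writable data (type 3 or 7) with DPL == RPL. */
static bool ss_ok(const struct seg *ss)
{
    if (ss->unusable)
        return true;
    return (ss->type == 3 || ss->type == 7) && ss->s && ss->present &&
           ss->dpl == (ss->selector & 3);
}

/* DS/ES/FS/GS: if usable, accessed; code segments must also be readable. */
static bool data_ok(const struct seg *d)
{
    if (d->unusable)
        return true;
    if (!d->s || !d->present)
        return false;
    if (!(d->type & 0x1))               /* accessed bit must be set */
        return false;
    if ((d->type & 0x8) && !(d->type & 0x2))
        return false;
    if (d->type <= 11 && d->dpl < (d->selector & 3))
        return false;
    return true;
}
```

Running the intended KVM_SET_SREGS input through checks like these would show
which segment, if any, leaves the guest in the "invalid" category.  For
instance, a data descriptor whose accessed bit is still clear (type 2) fails
data_ok().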

> > Anyways, I highly doubt this is a memory synchronization issue, a corner case
> > related to lack of unrestricted guest is much more likely.
> 
> Just to be sure I tried the CD bit in CR0 to rule out the caching issues, and
> that changes nothing.  So...
>
> What to do next?

In addition to the above experiment, can you get a state dump for the invalid
guest state failure?  I.e. load kvm_intel with dump_invalid_vmcs=1.  And on that
failure, also provide the input to KVM_SET_SREGS.  The LDT in memory might also
be interesting, but it's hopefully unnecessary, especially if unconditionally
doing KVM_SET_SREGS makes the page faults go away.
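
If it helps with capturing the KVM_SET_SREGS input, something along these
lines could log each segment right before the ioctl.  This is only a sketch:
"struct seg" and format_seg() are hypothetical, the struct mirrors the fields
of struct kvm_segment, and the output format is arbitrary:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in mirroring the fields of struct kvm_segment. */
struct seg {
    uint64_t base;
    uint32_t limit;
    uint16_t selector;
    uint8_t type, present, dpl, db, s, l, g, avl, unusable;
};

/* Format one segment on a single line; returns snprintf()'s result. */
static int format_seg(char *buf, size_t len, const char *name,
                      const struct seg *s)
{
    return snprintf(buf, len,
                    "%s sel=%04x base=%016llx limit=%08x type=%x s=%u "
                    "dpl=%u p=%u db=%u l=%u g=%u avl=%u unusable=%u",
                    name, (unsigned)s->selector,
                    (unsigned long long)s->base, (unsigned)s->limit,
                    (unsigned)s->type, (unsigned)s->s, (unsigned)s->dpl,
                    (unsigned)s->present, (unsigned)s->db, (unsigned)s->l,
                    (unsigned)s->g, (unsigned)s->avl,
                    (unsigned)s->unusable);
}
```

One line per segment (CS, DS, ES, FS, GS, SS, TR, LDT) alongside the VMCS dump
should be enough to compare what was requested with what KVM loaded.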

Best case scenario is that KVM_SET_SREGS stuffs invalid guest state that KVM
doesn't correctly detect.  That would be easy to debug and fix, and would give us
a regression test as well.


