Dropped my old @intel email to stop getting bounces.

On Mon, Jun 14, 2021, stsp wrote:
> 14.06.2021 20:06, Sean Christopherson wrote:
> > On Sun, Jun 13, 2021, stsp wrote:
> > > Hi kvm developers.
> > >
> > > I am having the strange problem that can only be reproduced on a core2duo CPU
> > > but not AMD FX or Intel Core I7.
> > >
> > > My code has 2 ways of setting the guest registers: one is the guest's ring0
> > > stub that just pops all regs from stack and does iret to ring3. That works
> > > fine. But sometimes I use KVM_SET_SREGS and resume the VM directly to ring3.
> > > That randomly results in either a good run or invalid guest state return, or
> > > a page fault in guest.
> > Hmm, a core2duo failure is more than likely due to lack of unrestricted guest.
> > You can verify this by loading kvm_intel on the Core i7 with unrestricted_guest=0.
>
> Wow, excellent shot! Indeed, the problem then starts reproducing also there!
> So at least I now have a problematic setup myself, rather than needing to ask
> for ssh from everyone involved. :)
>
> What does this mean to us, though? That it's completely unrelated to any
> memory synchronization?

Yes, more than likely this has nothing to do with memory synchronization.

> > > I tried to analyze when either of the above happens exactly, and I have a
> > > very strong suspicion that the problem is in a way I update LDT. LDT is
> > > shared between guest and host with KVM_SET_USER_MEMORY_REGION, and I modify
> > > it on host. So it seems like if I just allocated the new LDT entry, there is
> > > a risk of invalid guest state, as if the guest's LDT still doesn't have it.
> > > If I modified some LDT entry, there can be a page fault in guest, as if the
> > > entry is still old.
> > IIUC, you are updating the LDT itself, e.g. an FS/GS descriptor in the LDT, as
> > opposed to updating the LDT descriptor in the GDT?
>
> I am updating the LDT itself, not modifying its descriptor in gdt. And with
> the same KVM_SET_SREGS call I also update the segregs to the new values, if
> needed.

Hmm, unconditionally calling KVM_SET_SREGS if you modify anything in the LDT
would be worth trying. Or did I misunderstand the "if needed" part?

> > Either way, do you also update all relevant segments via KVM_SET_SREGS after
> > modifying memory?
>
> Yes, if this is needed. Sometimes it's not needed, and when not - it seems
> page fault is more likely. If I also update segregs - then invalid guest
> state. But these are just the statistical guesses so far.

Ah. Hrm. It would still be worth doing KVM_SET_SREGS unconditionally, e.g. it
would narrow the search if the page faults go away and the failures are always
invalid guest state.

> > Best guess is that KVM doesn't detect that the VM has state
> > that needs to be emulated, or that KVM's internal register state and what's in
> > memory are not consistent.
>
> Hope you know what parts are emulated w/o unrestricted guest, in which case
> we can advance. :)

It's not parts per se. KVM needs to emulate "everything", one instruction at a
time, until guest state is no longer invalid with respect to the !unrestricted
rules.

> > Anyways, I highly doubt this is a memory synchronization issue, a corner case
> > related to lack of unrestricted guest is much more likely.
>
> Just to be sure I tried the CD bit in CR0 to rule out the caching issues, and
> that changes nothing. So...
>
> What to do next?

In addition to the above experiment, can you get a state dump for the invalid
guest state failure? I.e. load kvm_intel with dump_invalid_vmcs=1. And on that
failure, also provide the input to KVM_SET_SREGS. The LDT in memory might also
be interesting, but it's hopefully unnecessary, especially if unconditionally
doing KVM_SET_SREGS makes the page faults go away.

Best case scenario is that KVM_SET_SREGS stuffs invalid guest state that KVM
doesn't correctly detect. That would be easy to debug and fix, and would give us
a regression test as well.
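
In case it helps with collecting the KVM_SET_SREGS input, here is a minimal
sketch of a wrapper that logs every segment register right before issuing the
ioctl. It is an illustration only, not code from this thread: the names
format_segment() and set_sregs_logged() are made up for the example, and it
assumes an x86 host with the usual <linux/kvm.h> uapi header.

```c
#include <stdio.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Hypothetical helper: render one segment register as a single text line.
 * Returns whatever snprintf() returns. */
int format_segment(char *buf, size_t len, const char *name,
                   const struct kvm_segment *s)
{
    return snprintf(buf, len,
                    "%-3s sel=%04x base=%016llx limit=%08x type=%x s=%u "
                    "dpl=%u p=%u avl=%u l=%u db=%u g=%u unusable=%u",
                    name, (unsigned)s->selector, (unsigned long long)s->base,
                    (unsigned)s->limit, (unsigned)s->type, (unsigned)s->s,
                    (unsigned)s->dpl, (unsigned)s->present, (unsigned)s->avl,
                    (unsigned)s->l, (unsigned)s->db, (unsigned)s->g,
                    (unsigned)s->unusable);
}

/* Hypothetical wrapper: write the exact KVM_SET_SREGS input to 'log', then
 * issue the ioctl on the vCPU fd. Returns the ioctl() result (-1 on error,
 * with errno set). */
int set_sregs_logged(int vcpu_fd, const struct kvm_sregs *sregs, FILE *log)
{
    const struct {
        const char *name;
        const struct kvm_segment *seg;
    } segs[] = {
        { "cs", &sregs->cs }, { "ds", &sregs->ds }, { "es", &sregs->es },
        { "fs", &sregs->fs }, { "gs", &sregs->gs }, { "ss", &sregs->ss },
        { "tr", &sregs->tr }, { "ldt", &sregs->ldt },
    };
    char line[192];
    size_t i;

    for (i = 0; i < sizeof(segs) / sizeof(segs[0]); i++) {
        format_segment(line, sizeof(line), segs[i].name, segs[i].seg);
        fprintf(log, "%s\n", line);
    }
    fprintf(log, "cr0=%llx cr3=%llx cr4=%llx efer=%llx\n",
            (unsigned long long)sregs->cr0, (unsigned long long)sregs->cr3,
            (unsigned long long)sregs->cr4, (unsigned long long)sregs->efer);

    return ioctl(vcpu_fd, KVM_SET_SREGS, sregs);
}
```

Calling set_sregs_logged(vcpu_fd, &sregs, stderr) in place of the bare ioctl()
keeps the logged values identical to what is handed to KVM, so the log can be
sent along with the dump_invalid_vmcs output.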