On Mon, Dec 22, 2014 at 3:14 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>
>
> On 23/12/2014 00:00, Andy Lutomirski wrote:
>> On Mon, Dec 22, 2014 at 2:49 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote:
>>>
>>>
>>> On 22/12/2014 17:03, Andy Lutomirski wrote:
>>>> This is wrong.  The guest *kernel* might not see the intermediate
>>>> state because the kernel (presumably it disabled migration while
>>>> reading pvti), but the guest vdso can't do that and could very easily
>>>> observe pvti while it's being written.
>>>
>>> No.  kvm_guest_time_update is called by vcpu_enter_guest, while the vCPU
>>> is not running, so it's entirely atomic from the point of view of the guest.
>>
>> Which vCPU?  Unless kvm_guest_time_update freezes all of the vcpus,
>> then there's a race:
>>
>> vCPU 0 guest: __getcpu
>> vdso thread migrates to vCPU 1
>> vCPU 0 exits
>> host starts writing pvti for vCPU 0
>> vdso thread starts reading pvti
>> host finishes writing pvti for vCPU 0
>> vCPU 0 resumes
>> vdso migrates back to vCPU 0
>> __getcpu returns 0
>>
>> and we fail.
>
> Yes, it does.  See kvm_gen_update_masterclock.
>
> See also http://www.spinics.net/lists/kvm/msg95533.html for some
> discussion about KVM_REQ_MCLOCK_INPROGRESS.

Ah.  Assuming that works, then most of my patches are unnecessary.

But then I have a different question: why do we bother doing the
__getcpu at all?  Can we rely on cpu 0's pvti to be appropriate for
all of the vcpus to use if the stable bit is set?

>
>> I'm having a hard time testing, since KVM on 3.19-rc1 appears to be
>> entirely unusable.  No matter what I do, I get this very early in
>> guest boot:
>>
>> KVM internal error. Suberror: 1
>> emulation failure
>> EAX=000dee58 EBX=00000000 ECX=00000000 EDX=00000cfd
>> ESI=00000059 EDI=00000000 EBP=00000000 ESP=00006fc4
>> EIP=000f17f4 EFL=00010012 [----A--] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>> CS =0008 00000000 ffffffff 00c09b00 DPL=0 CS32 [-RA]
>> SS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>> DS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>> FS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>> GS =0010 00000000 ffffffff 00c09300 DPL=0 DS   [-WA]
>> LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
>> TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
>> GDT=     000f6c58 00000037
>> IDT=     000f6c96 00000000
>> CR0=60000011 CR2=00000000 CR3=00000000 CR4=00000000
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000
>> DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> EFER=0000000000000000
>> Code=e8 75 fc ff ff 89 f2 a8 10 89 d8 75 0a b9 74 17 ff ff ff d1 <5b>
>> 5e c3 5b 5e e9 76 ff ff ff 57 56 53 8b 35 38 65 0f 00 85 f6 0f 88 be
>> 00 00 00 0f b7 f6
>>
>> and it sometimes comes with a lockdep splat, too.
>
> I can look at it tomorrow.  Does commit
> 2c4aa55a6af070262cca425745e8e54310e96b8d work for you?

Nope.  Running:

qemu-system-x86_64 -machine accel=kvm:tcg -cpu host -parallel none
-net none -echr 1 -serial none -chardev
stdio,id=console,signal=off,mux=on -serial chardev:console -mon
chardev=console -vga none -display none

from L1 where L1 is 3.19-rc1 or
2c4aa55a6af070262cca425745e8e54310e96b8d and L0 is a good Fedora
kernel results in the same failure after a couple of seconds.  This is
on Sandy Bridge Extreme.

I tried 3.19-rc1 on bare metal earlier today, and it didn't work any better.

--Andy

>
> Paolo

--
Andy Lutomirski
AMA Capital Management, LLC
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html