On Fri, Jun 18, 2021, Jim Mattson wrote:
> On Fri, Jun 18, 2021 at 2:55 PM stsp <stsp2@xxxxxxxxx> wrote:
> >
> > 19.06.2021 00:07, Jim Mattson wrote:
> > > On Fri, Jun 18, 2021 at 9:02 AM stsp <stsp2@xxxxxxxxx> wrote:
> > >
> > >> Here it goes.
> > >> But I studied it quite thoroughly
> > >> and can't see anything obviously
> > >> wrong.
> > >>
> > >>
> > >> [7011807.029737] *** Guest State ***
> > >> [7011807.029742] CR0: actual=0x0000000080000031,
> > >> shadow=0x00000000e0000031, gh_mask=fffffffffffffff7
> > >> [7011807.029743] CR4: actual=0x0000000000002041,
> > >> shadow=0x0000000000000001, gh_mask=ffffffffffffe871
> > >> [7011807.029744] CR3 = 0x000000000a709000
> > >> [7011807.029745] RSP = 0x000000000000eff0  RIP = 0x000000000000017c
> > >> [7011807.029746] RFLAGS=0x00080202         DR7 = 0x0000000000000400
> > >> [7011807.029747] Sysenter RSP=0000000000000000 CS:RIP=0000:0000000000000000
> > >> [7011807.029749] CS:   sel=0x0097, attr=0x040fb, limit=0x000001a0,
> > >> base=0x0000000002110000
> > >> [7011807.029751] DS:   sel=0x00f7, attr=0x0c0f2, limit=0xffffffff,
> > >> base=0x0000000000000000
> > > I believe DS is illegal. Per the SDM, Checks on Guest Segment Registers:
> > >
> > > * If the guest will not be virtual-8086, the different sub-fields are
> > > considered separately:
> > >   - Bits 3:0 (Type).
> > >     * DS, ES, FS, GS. The following checks apply if the register is usable:
> > >       - Bit 0 of the Type must be 1 (accessed).
> >
> > That seems to be it, thank you!
> > At least for the minimal reproducer
> > I've done.
> >
> > So only with unrestricted guest it's
> > possible to ignore that field?
>
> The VM-entry constraints are the same with unrestricted guest.
>
> Note that *without* unrestricted guest, kvm will generally have to
> emulate the early guest protected mode code--until the last vestiges
> of real-address mode are purged from the descriptor cache. Maybe it
> fails to set the accessed bits in the LDT on emulated segment register
> loads?

Argh!
Check out this gem:

	/*
	 * Fix the "Accessed" bit in AR field of segment registers for older
	 * qemu binaries.
	 * IA32 arch specifies that at the time of processor reset the
	 * "Accessed" bit in the AR field of segment registers is 1. And qemu
	 * is setting it to 0 in the userland code. This causes invalid guest
	 * state vmexit when "unrestricted guest" mode is turned on.
	 * Fix for this setup issue in cpu_reset is being pushed in the qemu
	 * tree. Newer qemu binaries with that qemu fix would not need this
	 * kvm hack.
	 */
	if (is_unrestricted_guest(vcpu) && (seg != VCPU_SREG_LDTR))
		var->type |= 0x1; /* Accessed */

KVM fixes up segs when unrestricted guest is enabled, but otherwise leaves 'em
be, presumably because it has the emulator to fall back on for invalid state.
Guess what's missing in the invalid state check...

I think this should do it:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 68a72c80bd3f..a753b9859826 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3427,6 +3427,8 @@ static bool data_segment_valid(struct kvm_vcpu *vcpu, int seg)
 		if (var.dpl < rpl) /* DPL < RPL */
 			return false;
 	}
+	if (!(var.type & VMX_AR_TYPE_ACCESSES_MASK))
+		return false;
 
 	/* TODO: Add other members to kvm_segment_field to allow checking for other access
 	 * rights flags