Andi Kleen wrote:
>> (1) We can make startup_32 work for every known and future reasonable
>> hypervisor as well as native, by testing if ring isn't 0 and paging is
>> enabled and jumping to the paravirt entry path.
>
> Somehow the right Hypervisor still needs to be discovered though
> so that the right paravirt ops can be installed.
>
> We would need a standard interface for this.
>
> Two possible ways:
> - Let the Hypervisor pass an identifier through the boot protocol
> - Do some instruction that traps and expect the Hypervisor
>   to fill in some registers or memory after that.

Passing an identifier is not an option for our hypervisor today. We rely
far too heavily on the BIOS to properly initialize devices, PCI I/O
space, the E820 map, and many other things that can't be done in
paravirtual startup mode without a ton of work. Rather than set all
this up in paravirtual start-of-day, it is easier for us to fully
virtualize the processor and chipset during the boot process.

Our boot detection code is shown below. We use I/O port detection to
reveal that we are running under VMware, and use a hidden MSR to enable
paravirt mode. The difficulty with the MSR alone is that it is hard to
choose one that is guaranteed not to trap on real hardware and does not
collide with existing MSRs, and handling the fault when the processor
decides to convert an RDMSR into a #GP is impossible this early during
boot. This may (and does) vary by processor. Hence the I/O port
detection as a failsafe.

Technically, we can initiate entry into paravirt mode at any time after
paging has been enabled.

VMI_ENTRY_INIT(Init)
        push %ebx

        /* Guarantee irq free */
        pushf
        cli

        /* Check for VMware */
        movl $0x564d5868,%eax
        xor  %ebx,%ebx
        movl $0x0a,%ecx
        movw $0x5658,%dx
        in   %dx,%eax
        cmpl $0x564d5868,%ebx
        jne  noParaVM

        /* Extract GDT/IDT/LDT/TR to stack prior to direct execution */
        sgdt STACK_GDTR(%esp)
        sidt STACK_IDTR(%esp)
        sldt STACK_LDT(%esp)
        str  STACK_TR(%esp)

        /* Paravirt enable */
        mov  $1,%eax
        movl $MSR_VM_EFER,%ecx
        wrmsr

        /* Check success */
        xor  %eax,%eax
        rdmsr
        test %eax, %eax
        jz   noParaVM
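
For anyone who finds the control flow easier to follow in C, here is a rough model of the ordering in the listing above: probe the I/O port first, and touch the MSR only once the probe has answered with the magic value. It is a sketch, not our actual code. The struct platform hooks, try_enable_paravirt() and the simulated machines are all hypothetical names invented for illustration; a real boot path issues the instructions directly. The native case assumes EBX is left at zero by the port read.

```c
#include <stdbool.h>
#include <stdint.h>

#define VMW_MAGIC 0x564d5868u /* magic constant from the listing */

/* Hypothetical hooks standing in for the raw instructions. */
struct platform {
    uint32_t (*port_probe)(void);       /* EBX after `in` on port 0x5658 */
    void     (*msr_write)(uint32_t lo); /* wrmsr to MSR_VM_EFER */
    uint32_t (*msr_read)(void);         /* rdmsr, low 32 bits (EAX) */
};

/* Mirrors the listing's order: the MSR is touched only after the
 * port probe has identified the hypervisor. */
static bool try_enable_paravirt(const struct platform *p)
{
    if (p->port_probe() != VMW_MAGIC)
        return false;              /* jne noParaVM */
    p->msr_write(1);               /* paravirt enable */
    return p->msr_read() != 0;     /* jz noParaVM if it reads back 0 */
}

/* Simulated machines, for illustration only. */
static uint32_t fake_msr;          /* models the MSR contents */
static int msr_accesses;           /* counts wrmsr/rdmsr attempts */

static uint32_t probe_native(void) { return 0; }  /* assumed: EBX stays 0 */
static uint32_t probe_vmware(void) { return VMW_MAGIC; }
static void wr_ok(uint32_t lo) { msr_accesses++; fake_msr = lo; }
static void wr_ignored(uint32_t lo) { (void)lo; msr_accesses++; fake_msr = 0; }
static uint32_t rd(void) { msr_accesses++; return fake_msr; }

static const struct platform native_hw   = { probe_native, wr_ok, rd };
static const struct platform vmware_pv   = { probe_vmware, wr_ok, rd };
static const struct platform vmware_nopv = { probe_vmware, wr_ignored, rd };
```

Calling try_enable_paravirt(&native_hw) returns false without ever reaching the MSR hooks, which is the point of the failsafe: on real hardware the possibly faulting RDMSR is never executed. The vmware_nopv case models a hypervisor that answers the port probe but does not accept the enable write, so the read-back check fails.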