Hi Eric,

Thanks for having a look at this.

Eric W. Biederman wrote:
> The linux 32bit entry point is well defined.
> %ebx holds the cpu number from a bootloader it must be 0.
> %esi holds a pointer to the linux parameter blob, that is usually
> filled in with BIOS calls.
>
> If you need hypervisor information at boot time and none of the
> existing parameters in %esi will suffice, bump the boot protocol
> version and allocate another variable. And find that variable
> through %esi. It's an extra instruction but not too hard.
> Bumping the boot protocol is probably desirable anyway because
> it will all reporting that the kernel can be paravirtualized.
>
> /sbin/kexec ultimately has to operate in all of these environments,
> and it would be insane if the bootloader had to be modified for a
> different calling convention for each environment. If another
> bootloader is being used it can be taught how to impedance match
> between linux and the hypervisor environment, that is the job of
> a bootloader.
>
> For those environments where paravirtualization is just an optimization
> we mostly likely want the detection to happen in arch/i386/boot/setup.S
> if it needs to happen early.
>
> As small food for thought. There is currently work in progress to
> place an ELF header at the start of the bzImage to export the 32bit
> entry point, and to export the capability of the kernel being
> relocated.
>
> Hopefully we can get to the point we can boot a standard bzImage
> kernel on the hypervisors as well. Even if we can't use the 16bit
> entry point.
>

There are a few significant differences between a Xen boot up and a
native one:

Xen has already set up a clean flat 32-bit environment with paging
enabled, so there's very little setup needed before getting into
start_kernel (basically it just needs to set %esp and make sure the D
flag is clear). We definitely don't want to be going into the 16-bit
entrypoint.

The kernel is running in ring != 0, so ring0 instructions will fault,
and popf will misbehave. Xen can (and does) emulate some of the ring0
instructions, but not necessarily enough to deal with startup_32 (I
haven't looked at this in detail yet). Either way, startup_32 would
need to be modified to avoid the difficult cases. Xen also supports
privileged kernels which run in ring 0, but they're still fully
paravirtualized kernels; they should not use their ring0 status to set
up the processor state without doing it through Xen.

At present, Xen also passes a pointer to an info-block in %esi. We
could hang that off a normal boot params block if that looked like a
useful thing to do.

Also, the set of supported CPUs is smaller, so most of the cpuid stuff
is redundant. It would also need to be redone using the Xen version of
cpuid to get a correct set of information.

This all makes me think it would be more awkward than helpful to have
the Xen boot path go through the normal startup_32 path.

Zach proposed a change to the beginning of startup_32 to see if it's
running in ring != 0 or if paging is already enabled, and then jump to
a startup_paravirt entrypoint. That's workable, but it essentially
means we're creating a distinct hypervisor boot protocol. That's not
necessarily a bad thing - and it could be made to look more like the
normal boot protocol - but because the setup code is so simple there
doesn't seem to be a lot to be gained from it.

In the Xen case, it makes more sense to simply have a separate
Xen-specific entrypoint to do a little bit of setup before jumping into
start_kernel.

    J
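
For illustration only, the test Zach proposed ("ring != 0, or paging
already enabled") could be sketched in C roughly as below. This is not
taken from his patch: the real check would be a few instructions of
assembly at the top of startup_32, and every name here (ring_from_cs,
looks_paravirt, the macros, the selector values in the checks) is
hypothetical. The functions take the %cs and %cr0 values as arguments
so the logic can be read and tested on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; a sketch of the decision, not kernel code. */
#define SEL_RPL_MASK 0x3u        /* low two bits of a selector: privilege level */
#define CR0_PG       (1u << 31)  /* CR0.PG: paging enabled */

/* Ring the code is running in, given the value read from %cs. */
static unsigned ring_from_cs(uint16_t cs)
{
    return cs & SEL_RPL_MASK;
}

/*
 * True if the entry state looks like a hypervisor handoff rather than
 * a native boot. The ring test comes first on purpose: in real code,
 * reading %cr0 outside ring 0 faults unless the hypervisor emulates
 * it, so %cr0 should only be consulted once we know we are in ring 0.
 */
static bool looks_paravirt(uint16_t cs, uint32_t cr0)
{
    return ring_from_cs(cs) != 0 || (cr0 & CR0_PG) != 0;
}

/* Usage checks with illustrative (assumed) selector and %cr0 values. */
static void example_checks(void)
{
    assert(!looks_paravirt(0x0010, 0x00000001u)); /* ring 0, PE only: native  */
    assert(looks_paravirt(0x0019, 0x00000000u));  /* RPL 1: not ring 0        */
    assert(looks_paravirt(0x0010, 0x80000001u));  /* ring 0 but paging is on  */
}
```

Note that the third case is the one a privileged (ring 0) Xen kernel
would hit, which is why the ring test alone would not be enough.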