Single PV startup vs multiple PV startup

jeremy at goop.org (Jeremy Fitzhardinge) · Thu, 27 Jul 2006 14:35:21 -0700

Zachary Amsden wrote:
> Jeremy Fitzhardinge wrote:
>> Single entry pros:
>>
>>     * simpler control flow?
>>   
>
> Simpler to maintain the code.  Maintaining multiple intertwined entry 
> points is a mess.  It used to be done for Visual Workstation, and the 
> code got merged back together.

They're not particularly complex or intertwined.  The startup routine's 
tasks are very simple:

   1. get the CPU into a state where we can run C code (ie, basically
      sane segments & stack)
   2. copy the right stuff into paravirt_ops
   3. jump into start_kernel

One could imagine step 1 might be complex, depending on how the 
hypervisor starts the world, but it should be fairly self-contained.

2 is probably a trivial memcpy, but may involve a bit more (selecting 
various different pv-ops implementations based on other state, choosing 
the right kernel ring, etc).

3 is a jmp.

If we're getting into a situation where the startups are more complex 
than that, then we need to fix that, and it probably means there's some 
other requirement we're not even considering now.
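
As a rough sketch of steps 2 and 3 in C (step 1 is assumed to have been 
done in assembly already): the struct layout, hv_ops, and the *_sketch 
names are all made up for illustration and are not the real paravirt_ops 
definitions.

```c
#include <string.h>

/* Hypothetical, cut-down stand-in for the kernel's paravirt_ops table. */
struct pv_ops_sketch {
    const char *name;           /* which backend filled the table */
    int kernel_ring;            /* ring the kernel will run in */
    void (*irq_disable)(void);  /* one example operation */
};

/* The global table that the rest of the kernel would call through. */
struct pv_ops_sketch paravirt_ops;

/* Stand-in for start_kernel(); it only counts calls so it can be observed. */
int start_kernel_calls;
void start_kernel_sketch(void) { start_kernel_calls++; }

/* Hypothetical hypervisor-specific implementation of one operation. */
void hv_irq_disable(void) { }

/* The ops this (made-up) hypervisor wants installed. */
const struct pv_ops_sketch hv_ops = {
    .name = "example-hv",
    .kernel_ring = 1,
    .irq_disable = hv_irq_disable,
};

/* Entry point reached once segments and stack are sane (step 1). */
void hypervisor_specific_init(void)
{
    /* Step 2: copy the right stuff into paravirt_ops. */
    memcpy(&paravirt_ops, &hv_ops, sizeof(paravirt_ops));

    /* Step 3: hand over to the common kernel startup. */
    start_kernel_sketch();
}
```

A backend that needs to pick between several implementations would do 
so just before the memcpy; nothing else in the flow changes.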

Fundamentally there's no complexity difference between having:

    hypervisor -> hypervisor_specific_init -> start_kernel

and

    hypervisor -> startup_pv -> hypervisor_specific_init -> start_kernel

except that startup_pv is pretty much redundant.

I wouldn't object to having a helper function which does all the common 
stuff.  All I think is important is that the first kernel instruction 
executed is in a piece of hypervisor-specific code.

It could also be the case that several hypervisors decide they can quite 
happily share a given entrypoint; there's no reason why they couldn't.

>   Here's a pretty easy thing to do:
>
> ENTRY(startup_32)
>
> #ifdef CONFIG_PARAVIRT
>        movl %cs, %eax
>        testl $0x3, %eax
>        jnz pv_startup_32
> #endif
>
> Now you don't need any special magic at all to register an alternate 
> entry point.

Well, it assumes that your kernel is running in ring != 0, or that all ring 
0 guests will want to go through the normal startup routine.  That's not 
the case with Xen right now - it supports running (trusted) ring 0 
kernels under the hypervisor, but they're otherwise fully paravirtualized.
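
To make the limitation concrete, here is the same test written out in 
C.  The low two bits of a segment selector are its privilege level, so 
the check can only tell ring 0 from ring != 0.  The function names and 
the selector values in the usage note are illustrative, not taken from 
any kernel or hypervisor header.

```c
#include <stdbool.h>
#include <stdint.h>

/* The low two bits of %cs give the ring the code is currently running in. */
unsigned selector_ring(uint16_t cs)
{
    return cs & 0x3;
}

/*
 * Mirrors "testl $0x3, %eax; jnz pv_startup_32": the paravirt path is
 * taken only when the kernel was entered outside ring 0.
 */
bool takes_pv_entry(uint16_t cs)
{
    return selector_ring(cs) != 0;
}
```

With an illustrative selector of 0x0061 (ring 1) the check takes the PV 
path, but with 0x0060 (ring 0) it falls through to the native startup, 
which is exactly what a ring 0 paravirtualized guest does not want.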

>> Where do the ebx constants come from anyway?  If they're provided by 
>> the hypervisor itself, it means we need to have a registry of who's 
>> using what number, right?  We'd require them to all be unique (since 
>> collisions would be disastrous), and also they need to be densely 
>> allocated (since we're using them as an array index).
>>   
>
> Why is registering these constants hard?

Because it must be centralized and done up-front.   It means you can't 
just go and port Linux to your hypervisor without 1) registering with 
the Linux Paravirtual Hypervisor Assigned Numbers Authority, and 2) 
changing your side to set up %ebx with that number.  And who's going to 
be the LPHANA anyway?
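
A minimal sketch of what an %ebx-indexed dispatch implies, assuming the 
value is used directly as a table index.  The IDs, the table, and the 
init functions are hypothetical; the point is only that every ID has to 
be known to the kernel source ahead of time.

```c
#include <stddef.h>

typedef void (*pv_init_fn)(void);

/* Hypothetical centrally assigned IDs: unique and densely packed. */
enum {
    PV_ID_NATIVE = 0,
    PV_ID_HV_A   = 1,
    PV_ID_HV_B   = 2,
    PV_ID_MAX
};

/* Records which init routine ran, so the dispatch can be observed. */
int last_init = -1;

void native_init(void) { last_init = PV_ID_NATIVE; }
void hv_a_init(void)   { last_init = PV_ID_HV_A; }
void hv_b_init(void)   { last_init = PV_ID_HV_B; }

/* One slot per registered ID; an unregistered hypervisor has no slot. */
pv_init_fn const pv_init_table[PV_ID_MAX] = {
    [PV_ID_NATIVE] = native_init,
    [PV_ID_HV_A]   = hv_a_init,
    [PV_ID_HV_B]   = hv_b_init,
};

/* Returns 0 on success, -1 if the ID was never registered. */
int pv_dispatch(unsigned long ebx)
{
    if (ebx >= (unsigned long)PV_ID_MAX || pv_init_table[ebx] == NULL)
        return -1;
    pv_init_table[ebx]();
    return 0;
}
```

A new hypervisor cannot boot this kernel until someone hands it an ID 
and adds a table entry, which is the up-front coordination cost 
described above.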

> I prefer the symmetry of having all vendors use the same entry method 
> rather than encouraging ad-hoc hacks.

I think it's a false symmetry.  It means that any port of Linux to a PV 
hypervisor also requires the hypervisor itself to be changed for this 
interface, to use a Linux-specific domain builder.

    J