Rusty Russell wrote:
> I want to make three changes to this over time:
>
> 1) Copy the ops structure in the asm, based on value of %ebx (0 == xen,
> etc). Only copy the non-NULL entries, to make implementing ops simple
> (eg. Xen doesn't want to override all ops). Xen wants %esi, so I might
> have to move that to %eax: I'll see how it works out.
>

I'm coming to the conclusion that having separate entrypoints for each
hypervisor is really the right way to go. Assuming that all hypervisors
have some way to set the entrypoint, then it's reasonable to also assume
they're each going to have a different method for doing so (ideally
orthogonal to each other). That means that if Xen does it with its
__xen_guest string section and VMI does it some other (possibly similar)
way, they can all get along.

It isn't clear to me who sets %ebx in your proposal. If you're
suggesting that the hypervisors do it, it seems a bit presumptuous to
have a specific mechanism just for us. If you're saying that a common PV
startup function needs to try to sniff what hypervisor it's under, that
seems very tricky, particularly since we can't take any faults at that
point.

It's also possible that a hypervisor is fully virtualizing, so that boot
proceeds via the normal startup_32 path, but at some later point the
guest can register some pv_ops for better performance. (Similar to your
idea for an in-kernel modular hypervisor.)

At some future point, it would be nice to be able to load replacement
paravirt ops implementations via multiboot/grub modules (or some similar
mechanism), so that the interface to the hypervisor can be updated for
old guests. I envision that it would steal control away from the
in-kernel paravirt code by redirecting the entrypoint, install a new
pv_ops structure and then boot as normal (I haven't investigated this at
all; this is just the most plausible-sounding mechanism I thought of).
This would be the moral equivalent of compiling a new scsi driver for an
old kernel in order to support new hardware. It's similar to the idea
that VMI can replace the ROM from boot to boot, but at the level of a
source API rather than a fixed long-term ABI.

I also considered the idea of having NULL pointers in the pv_ops
structure and only copying non-NULL pointers, but I decided against it.
It seems cleaner to me to explicitly set the pointer to the nopara_
function, so that you can easily look at the structure and see which
functions have been implemented and which have been forgotten.

> 2) Call *paravirt_ops.init rather than hardcoded xen_start_kernel.
>

That seems particularly pointless. By the time you need to call it, you
already know which hypervisor you're under, so you could just call it
directly. Since there's not much common code between the various
hypervisor startups (not much code at all, full stop), there doesn't
seem to be much scope for usefully sharing code.

> 3) Rename from xen-head.S to paravirt-head.S.
>

My plan was that there would be a paravirt-foo directory for each
hypervisor, and a corresponding foo-specific entrypoint in head.S, which
would be included from foo-head.S.

>> I also haven't really gone over the list of paravirt ops in detail to
>> see if they're really what we want; I figure that will come up as I keep
>> adapting Xen to the interface. But an obvious seems to be we should
>> have explicit flush_tlb/multicast_flush_tlb calls rather than simply
>> relying on reloading cr3.
>>
>
> Yep, and I thought about set_tss_desc, rather than lower-level ops,
> because Xen doesn't want it at all. But see how you go..
>

Yeah. I was just looking at load_idt, which is pretty strange. The Xen
version of it ignores the argument and always loads its own exception
table. I'm thinking that if we need it at all, it should be called
something like load_exceptions(void). Perhaps it should take the
argument, but only do something if it == &idt_descr...

    J
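
The trade-off above between copying only the non-NULL entries and explicitly filling every slot with a nopara_ function can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct pv_ops_sketch, the two-entry table, the hv_ and nopara_ function names and both installers are hypothetical and are not taken from the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-reduced ops table. The real structure has many
 * more entries; these two are picked only for illustration. */
struct pv_ops_sketch {
    int (*flush_tlb)(void);
    int (*load_exceptions)(void);
};

/* Native ("nopara_") defaults; they return 0 so a caller can tell
 * them apart from the hypervisor versions below. */
static int nopara_flush_tlb(void)       { return 0; }
static int nopara_load_exceptions(void) { return 0; }

/* A made-up hypervisor that overrides only flush_tlb. */
static int hv_flush_tlb(void) { return 1; }

/* Style A: sparse table; a NULL slot means "keep what is installed". */
static const struct pv_ops_sketch hv_sparse = {
    .flush_tlb = hv_flush_tlb,
    /* .load_exceptions is implicitly NULL: deliberate, or forgotten? */
};

/* Style B: every slot is named, so the table records each decision. */
static const struct pv_ops_sketch hv_explicit = {
    .flush_tlb       = hv_flush_tlb,
    .load_exceptions = nopara_load_exceptions,
};

/* Build a table holding only the native defaults. */
static struct pv_ops_sketch native_ops(void)
{
    struct pv_ops_sketch ops = {
        .flush_tlb       = nopara_flush_tlb,
        .load_exceptions = nopara_load_exceptions,
    };
    return ops;
}

/* Style A installer: copy only the non-NULL entries. */
static void install_non_null(struct pv_ops_sketch *dst,
                             const struct pv_ops_sketch *src)
{
    assert(dst && src);
    if (src->flush_tlb)
        dst->flush_tlb = src->flush_tlb;
    if (src->load_exceptions)
        dst->load_exceptions = src->load_exceptions;
}

/* Style B installer: plain structure copy, no per-entry checks. */
static void install_all(struct pv_ops_sketch *dst,
                        const struct pv_ops_sketch *src)
{
    assert(dst && src);
    *dst = *src;
}
```

Starting from native_ops(), both install_non_null(&ops, &hv_sparse) and install_all(&ops, &hv_explicit) leave flush_tlb pointing at the hypervisor version and load_exceptions at the native default. The difference is only in what the table itself tells a reader: in the explicit style a missing entry shows up as an omission rather than as a silent NULL.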