Thomas Gleixner wrote:
> This is tinkering of the best. My understanding of the paravirt
> discussion at Kernel Summit was, that paravirt ops are exactly there to
> prevent the above random hackery in the kernel and to allow _ALL_
> hypervisors to interact via a sane interface inside of the kernel.
>

No, I don't think that was ever the intent. The idea was to create a
new interface for things which don't currently have an interface in the
kernel, such as how to run the CPU in ring 1 and manage pagetable
updates. But an important and explicit intent of the project was to use
existing kernel interfaces where possible, rather than try to make
pv_ops a monster all-encompassing interface. Using the new time
infrastructure was an explicit example of that.

We anticipated that different hypervisors would have different ways of
doing time, but all would be easily accommodated by the
clocksource/events infrastructure, and so each would have its own
implementation for these interfaces. From the kernel's perspective,
they're just another time device, and we manage to avoid making any core
kernel changes, or bloating the pv_ops interface. It seems like a
natural use of the clock subsystem's design.

> You are just perverting the whole idea of a standartized
> paravirtualization interface.
>
> This things can be done for clocksources, clockevents, interrupts (the
> generic irq code allows this) and probaly for a whole bunch of other
> stuff.
>

Yes, exactly. The entirety of the Xen support consists of not only an
implementation of the paravirt_ops interface, but also the Xen
clocksource and clockevents and the Xen irqchip. My hope and intent is
that we can shrink the paravirt_ops interface in favour of using
existing generally useful kernel interfaces.

> The current paravirt interface is completely insane and will explode
> into an unmaintainable nightmare within no time, if we keep accepting
> that crap further.
>

No, that's exactly what we've been trying to avoid.
If we start patching in new paravirt_ops to deal with time, interrupts,
or whatever piece of functionality which already has a perfectly good
kernel interface, then we're just increasing the size of the pv_ops
interface, its entanglement with the rest of the system and the amount
of potential legacy stuff which gets dragged around as the interface
evolves.

As hardware gets better at supporting virtualization directly, we're
going to see more hybrid para- and fully-virtualized hypervisor
interfaces. The result will be that more and more of paravirt_ops will
be implemented by the "native" versions of the functions; maybe at some
point the whole thing will evaporate away.

It's not a huge reach to expect the hardware vendors to get a clue about
time hardware (scratch that, of course it is, but we can always hope)
and come up with something that is directly usable from either an OS
running natively or from within a virtual machine. In that case, I'm
sure you'd agree it would warrant a real clocksource/event
implementation. In the scheme I'm proposing, that's no big deal; you
just register the hardware driver, and that's that. But what you're
proposing leaves this vestigial interface sitting in pv_ops, doing
nothing other than being redundant.

My principal goal here is to get the Xen code into the kernel, and I'm
being pragmatic about it. If you think having a xen_clocksource is an
absolute blocker to merging this stuff, then I'll add the interface to
pv_ops, and we'll work out how to wire all the hypervisors up underneath
that interface. But I think it's precisely the wrong way to go from an
overall kernel perspective.

    J