On 17.07.2013, at 17:36, Yoder Stuart-B08248 wrote:

>> -----Original Message-----
>> From: Alexander Graf [mailto:agraf@xxxxxxx]
>> Sent: Wednesday, July 17, 2013 10:21 AM
>> To: Yoder Stuart-B08248
>> Cc: Wood Scott-B07421; Bhushan Bharat-R65777; kvm@xxxxxxxxxxxxxxx;
>> kvm-ppc@xxxxxxxxxxxxxxx; Gleb Natapov
>> Subject: Re: [PATCH 3/5] booke: define reset and shutdown hcalls
>>
>> On 17.07.2013, at 17:19, Yoder Stuart-B08248 wrote:
>>
>>>> -----Original Message-----
>>>> From: Alexander Graf [mailto:agraf@xxxxxxx]
>>>> Sent: Wednesday, July 17, 2013 7:19 AM
>>>> To: Gleb Natapov
>>>> Cc: Wood Scott-B07421; Bhushan Bharat-R65777; kvm@xxxxxxxxxxxxxxx;
>>>> kvm-ppc@xxxxxxxxxxxxxxx; Yoder Stuart-B08248; Bhushan Bharat-R65777
>>>> Subject: Re: [PATCH 3/5] booke: define reset and shutdown hcalls
>>>>
>>>> On 17.07.2013, at 13:00, Gleb Natapov wrote:
>>>>
>>>>> On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
>>>>>> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
>>>>>>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
>>>>>>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
>>>>>>>>> There is not much sense in sharing hypercalls between
>>>>>>>>> architectures. There is zero probability x86 will implement
>>>>>>>>> those, for instance.
>>>>>>>>
>>>>>>>> This is similar to the question of whether to keep device API
>>>>>>>> enumerations per-architecture... It costs very little to keep
>>>>>>>> it in a common place, and it's hard to go back in the other
>>>>>>>> direction if we later realize there are things that should be
>>>>>>>> shared.
>>>>>>>>
>>>>>>> This is different from the device API, since with the device API
>>>>>>> all arches have to create/destroy devices, so it makes sense to
>>>>>>> put device lifecycle management into the common code. The device
>>>>>>> API also has a single entry point into the code - the device fd
>>>>>>> ioctl - where it makes sense to handle common tasks, if any, and
>>>>>>> dispatch the rest to the specific device implementation.
>>>>>>>
>>>>>>> This is totally unlike hypercalls, which are by definition very
>>>>>>> architecture specific (the way they are triggered, the way
>>>>>>> parameters are passed from guest to host, which hypercalls an
>>>>>>> arch needs...).
>>>>>>
>>>>>> The ABI is architecture specific. The API doesn't need to be, any
>>>>>> more than it does with syscalls (I consider the
>>>>>> architecture-specific definition of syscall numbers and similar
>>>>>> constants in Linux to be unfortunate, especially for tools such
>>>>>> as strace or QEMU's linux-user emulation).
>>>>>>
>>>>> Unlike syscalls, different arches have very different ideas about
>>>>> which hypercalls they need to implement, so while I can see how a
>>>>> unified syscall space may benefit a (very) small number of tools,
>>>>> I do not see what advantage a unified hypercall space would give
>>>>> us. The disadvantage is one more global namespace to manage.
>>>>>
>>>>>>>> Keeping it in a common place also makes it more visible to
>>>>>>>> people looking to add new hcalls, which could cut down on
>>>>>>>> reinventing the wheel.
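[For concreteness, the "single entry point" Gleb refers to above looks roughly like this from userspace - a minimal sketch using the KVM_CREATE_DEVICE ioctl, with error handling and the follow-up KVM_SET_DEVICE_ATTR calls elided:]

    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    /* Minimal sketch: all device lifecycle management funnels through
     * one ioctl on the VM fd. vm_fd is assumed to be an open KVM VM
     * file descriptor; type is a device type constant such as
     * KVM_DEV_TYPE_FSL_MPIC_20. On success the kernel fills in a
     * per-device fd, and all further, device-specific configuration is
     * dispatched through that fd. */
    int create_kvm_device(int vm_fd, __u32 type)
    {
        struct kvm_create_device kcd = { .type = type };

        if (ioctl(vm_fd, KVM_CREATE_DEVICE, &kcd) < 0)
            return -1;    /* errno describes the failure */

        return kcd.fd;    /* use with KVM_SET_DEVICE_ATTR etc. */
    }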
>>>>>>> I do not want other arches to start using hypercalls the way
>>>>>>> powerpc started to use them - as a separate device I/O space -
>>>>>>> so it is better to hide this as far away from common code as
>>>>>>> possible :) But on a more serious note, hypercalls should be a
>>>>>>> last resort, added only when no other possibility exists.
>>>>>>> People should not look at what hcalls others have implemented
>>>>>>> so they can add them to their favorite arch; they should have a
>>>>>>> problem at hand that they cannot solve without an hcall, and at
>>>>>>> that point they will have a pretty good idea of what the hcall
>>>>>>> should do.
>>>>>>
>>>>>> Why are hcalls such a bad thing?
>>>>>>
>>>>> Because they are often used to do non-architectural things, making
>>>>> OSes behave differently from how they run on real HW, and real HW
>>>>> is what OSes are designed and tested for. Example: there once was
>>>>> a KVM hypercall (Xen has/had a similar one) to accelerate MMU
>>>>> operations. One thing it allowed was flushing the TLB without
>>>>> doing an IPI if the vcpu was not running. Later an optimization
>>>>> was added to the Linux MMU code that _relies_ on those IPIs for
>>>>> synchronisation. It is good that by that point those hypercalls
>>>>> had already been deprecated in KVM (IIRC Xen was broken for some
>>>>> time in that regard). Which brings me to another point: hypercalls
>>>>> often get obsoleted by code improvements and HW advancements (as
>>>>> happened to the aforementioned MMU hypercalls), but they are hard
>>>>> to deprecate if the hypervisor supports live migration; without
>>>>> live migration it is less of a problem. The next point is that
>>>>> people often try to use hypercalls instead of emulating a PV or
>>>>> real device just because they think it is easier, but it is often
>>>>> not so. Example: the pvpanic device was initially proposed as a
>>>>> hypercall, so let's say we had implemented it as such. It would
>>>>> have been KVM specific, the implementation would have touched
>>>>> core guest KVM code, and it would have been Linux guest specific.
>>>>> Instead it was implemented as a platform device with a very small
>>>>> platform driver confined to the drivers/ directory, immediately
>>>>> usable by Xen and QEMU TCG in addition
>>>>
>>>> This is actually a very good point. How do we support reboot and
>>>> shutdown for TCG guests? We surely don't want to expose TCG as a
>>>> KVM hypervisor.
>>>
>>> Hmm...so are you proposing that we abandon the current approach,
>>> and switch to a device-based mechanism for reboot/shutdown?
>>
>> Reading Gleb's email it sounds like the more future-proof approach,
>> yes. I'm not quite sure yet where we should plug this in, though.
>
> What do you mean...where the paravirt device would go in the physical
> address map??

Right. Either we

- let the guest decide (PCI)
- let QEMU decide, but potentially break the SoC layout (SysBus)
- let QEMU decide, but only for the virt machine so that we don't
  break anyone (PlatBus)

Alex
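[For a sense of what the SysBus/PlatBus options would mean in practice, here is a hypothetical sketch of such a paravirt power-control device against the QEMU device model of that era. The type name, the one-register MMIO layout, and the register encoding are all invented for illustration, and exact QEMU signatures drifted between releases around this time:]

    #include "hw/sysbus.h"
    #include "sysemu/sysemu.h"   /* qemu_system_*_request() */

    #define TYPE_PV_POWER "pv-power"   /* hypothetical device name */
    #define PV_POWER(obj) OBJECT_CHECK(PVPowerState, (obj), TYPE_PV_POWER)

    typedef struct PVPowerState {
        SysBusDevice parent_obj;
        MemoryRegion iomem;
    } PVPowerState;

    static void pv_power_write(void *opaque, hwaddr addr,
                               uint64_t val, unsigned size)
    {
        /* Invented register encoding: 1 = reboot, 2 = power off. */
        switch (val) {
        case 1:
            qemu_system_reset_request();
            break;
        case 2:
            qemu_system_shutdown_request();
            break;
        }
    }

    static const MemoryRegionOps pv_power_ops = {
        .write = pv_power_write,
        .endianness = DEVICE_NATIVE_ENDIAN,
    };

    static int pv_power_init(SysBusDevice *sbd)
    {
        PVPowerState *s = PV_POWER(sbd);

        /* One 4-byte register; the device itself does not pick an
         * address, it only exposes an MMIO region. */
        memory_region_init_io(&s->iomem, &pv_power_ops, s, "pv-power", 4);
        sysbus_init_mmio(sbd, &s->iomem);
        return 0;
    }

    static void pv_power_class_init(ObjectClass *klass, void *data)
    {
        SysBusDeviceClass *k = SYS_BUS_DEVICE_CLASS(klass);

        k->init = pv_power_init;
    }

    static const TypeInfo pv_power_info = {
        .name          = TYPE_PV_POWER,
        .parent        = TYPE_SYS_BUS_DEVICE,
        .instance_size = sizeof(PVPowerState),
        .class_init    = pv_power_class_init,
    };

    static void pv_power_register_types(void)
    {
        type_register_static(&pv_power_info);
    }

    type_init(pv_power_register_types)

[The machine code would then instantiate the device and choose where it lives, e.g. sysbus_mmio_map(SYS_BUS_DEVICE(dev), 0, some_addr). Doing that only in the virt machine is exactly the PlatBus trade-off above: QEMU decides the address, but existing SoC layouts are untouched.]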