> >>>> On 17.07.2013, at 13:00, Gleb Natapov wrote:
> >>>>
> >>>>> On Tue, Jul 16, 2013 at 06:04:34PM -0500, Scott Wood wrote:
> >>>>>> On 07/16/2013 01:35:55 AM, Gleb Natapov wrote:
> >>>>>>> On Mon, Jul 15, 2013 at 01:17:33PM -0500, Scott Wood wrote:
> >>>>>>>> On 07/15/2013 06:30:20 AM, Gleb Natapov wrote:
> >>>>>>>>> There is not much sense in sharing hypercalls between
> >>>>>>>>> architectures. There is zero probability that x86 will
> >>>>>>>>> implement those, for instance.
> >>>>>>>>
> >>>>>>>> This is similar to the question of whether to keep device API
> >>>>>>>> enumerations per-architecture... It costs very little to keep
> >>>>>>>> them in a common place, and it's hard to go back in the other
> >>>>>>>> direction if we later realize there are things that should be
> >>>>>>>> shared.
> >>>>>>>>
> >>>>>>> This is different from the device API, since with the device API
> >>>>>>> all arches have to create/destroy devices, so it makes sense to
> >>>>>>> put device lifecycle management into the common code, and the
> >>>>>>> device API has a single entry point into the code - the device fd
> >>>>>>> ioctl - where it makes sense to handle common tasks, if any, and
> >>>>>>> dispatch the rest to the specific device implementation.
> >>>>>>>
> >>>>>>> This is totally unlike hypercalls, which are, by definition, very
> >>>>>>> architecture specific (the way they are triggered, the way
> >>>>>>> parameters are passed from guest to host, which hypercalls an
> >>>>>>> arch needs...).
> >>>>>>
> >>>>>> The ABI is architecture specific. The API doesn't need to be,
> >>>>>> any more than it does with syscalls (I consider the
> >>>>>> architecture-specific definition of syscall numbers and similar
> >>>>>> constants in Linux to be unfortunate, especially for tools such
> >>>>>> as strace or QEMU's linux-user emulation).
> >>>>>>
> >>>>> Unlike syscalls, different arches have very different ideas about
> >>>>> which hypercalls they need to implement, so while I can see how a
> >>>>> unified syscall space may benefit a (very) small number of tools, I
> >>>>> do not see what advantage it would give us. The disadvantage is one
> >>>>> more global namespace to manage.
> >>>>>
> >>>>>>>> Keeping it in a common place also makes it more visible to
> >>>>>>>> people looking to add new hcalls, which could cut down on
> >>>>>>>> reinventing the wheel.
> >>>>>>> I do not want other arches to start using hypercalls the way
> >>>>>>> powerpc started to use them - as a separate device I/O space - so
> >>>>>>> it is better to hide this as far away from common code as
> >>>>>>> possible :) But on a more serious note, hypercalls should be a
> >>>>>>> last resort, added only when no other possibility exists. People
> >>>>>>> should not look at what hcalls others have implemented so they
> >>>>>>> can add them to their favourite arch; rather, they should have a
> >>>>>>> problem at hand that they cannot solve without an hcall, and at
> >>>>>>> that point they will have a pretty good idea of what the hcall
> >>>>>>> should do.
> >>>>>>
> >>>>>> Why are hcalls such a bad thing?
> >>>>>>
> >>>>> Because they are often used to do non-architectural things, making
> >>>>> OSes behave differently from how they run on real HW, and real HW
> >>>>> is what OSes are designed and tested for. Example: there once was a
> >>>>> KVM hypercall (Xen has/had a similar one) to accelerate MMU
> >>>>> operations. One thing it allowed was flushing the TLB without doing
> >>>>> an IPI if the vcpu was not running. Later an optimization was added
> >>>>> to the Linux MMU code that _relies_ on those IPIs for
> >>>>> synchronisation. It is good that by that point those hypercalls had
> >>>>> already been deprecated in KVM (IIRC Xen was broken for some time
> >>>>> in that regard). Which brings me to another point: hypercalls often
> >>>>> get obsoleted by code improvements and HW advancements (this
> >>>>> happened to the aforementioned MMU hypercalls), but they are hard
> >>>>> to deprecate if the hypervisor supports live migration; without
> >>>>> live migration it is less of a problem. The next point is that
> >>>>> people often try to use them instead of emulating a PV or real
> >>>>> device just because they think it is easier, but it is often not
> >>>>> so. Example: the pvpanic device was initially proposed as a
> >>>>> hypercall, so let's say we had implemented it as such. It would
> >>>>> have been KVM specific, the implementation would have touched core
> >>>>> guest KVM code, and it would have been Linux guest specific.
> >>>>> Instead it was implemented as a platform device with a very small
> >>>>> platform driver confined to the drivers/ directory, immediately
> >>>>> usable by Xen and QEMU TCG in addition.
> >>>>
> >>>> This is actually a very good point. How do we support reboot and
> >>>> shutdown for TCG guests? We surely don't want to expose TCG as a KVM
> >>>> hypervisor.
> >>>
> >>> Hmm... so are you proposing that we abandon the current approach and
> >>> switch to a device-based mechanism for reboot/shutdown?
> >>
> >> Reading Gleb's email, it sounds like the more future-proof approach,
> >> yes. I'm not quite sure yet where we should plug this in, though.
> >
> > What do you mean... where the paravirt device would go in the physical
> > address map?
>
> Right. Either we
>
> - let the guest decide (PCI)
> - let QEMU decide, but potentially break the SoC layout (SysBus)
> - let QEMU decide, but only for the virt machine so that we don't break
>   anyone (PlatBus)

Can you please elaborate on the above two points?

-Bharat

>
> Alex
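For reference, the "very small platform driver" Gleb mentions for pvpanic can be sketched in a few dozen lines. The sketch below is a hedged illustration only, not the in-tree drivers/misc pvpanic code: the single byte-wide MMIO register, the PVPANIC_PANICKED bit value, and the "pvpanic-sketch" driver name are assumptions made for the example, and header locations differ between kernel versions. It shows the point made above: the guest side is an ordinary platform driver plus a panic notifier, with no KVM-specific code, so the same device works under Xen or QEMU TCG as well.

/*
 * Hedged sketch of a pvpanic-style guest driver (NOT the in-tree
 * drivers/misc code).  Register layout, bit value and driver name are
 * assumptions for illustration only.
 */
#include <linux/kernel.h>	/* panic_notifier_list (newer kernels: <linux/panic_notifier.h>) */
#include <linux/module.h>
#include <linux/notifier.h>
#include <linux/err.h>
#include <linux/io.h>
#include <linux/ioport.h>
#include <linux/platform_device.h>

#define PVPANIC_PANICKED	(1 << 0)	/* assumed "guest panicked" event bit */

static void __iomem *pvpanic_base;

/* On panic, poke the device register; the host side (KVM, Xen or TCG
 * board code) turns that write into a "guest panicked" event. */
static int pvpanic_panic_notify(struct notifier_block *nb,
				unsigned long code, void *unused)
{
	if (pvpanic_base)
		iowrite8(PVPANIC_PANICKED, pvpanic_base);
	return NOTIFY_DONE;
}

static struct notifier_block pvpanic_panic_nb = {
	.notifier_call = pvpanic_panic_notify,
};

static int pvpanic_probe(struct platform_device *pdev)
{
	struct resource *res;

	/* Map the single device register described by the platform/DT data. */
	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
	pvpanic_base = devm_ioremap_resource(&pdev->dev, res);
	if (IS_ERR(pvpanic_base))
		return PTR_ERR(pvpanic_base);

	atomic_notifier_chain_register(&panic_notifier_list, &pvpanic_panic_nb);
	return 0;
}

static int pvpanic_remove(struct platform_device *pdev)
{
	atomic_notifier_chain_unregister(&panic_notifier_list, &pvpanic_panic_nb);
	return 0;
}

static struct platform_driver pvpanic_sketch_driver = {
	.probe	= pvpanic_probe,
	.remove	= pvpanic_remove,
	.driver	= {
		.name = "pvpanic-sketch",	/* hypothetical name */
	},
};
module_platform_driver(pvpanic_sketch_driver);

MODULE_LICENSE("GPL");

Nothing in the sketch knows which hypervisor (or TCG) sits behind the register, which is exactly the portability argument being made in the thread.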