On 09.05.2012, at 21:38, Eduardo Habkost wrote: > On Wed, May 09, 2012 at 12:38:37PM +0300, Gleb Natapov wrote: >> On Wed, May 09, 2012 at 11:05:58AM +0200, Alexander Graf wrote: >>> >>> On 09.05.2012, at 10:51, Gleb Natapov wrote: >>> >>>> On Wed, May 09, 2012 at 10:42:26AM +0200, Alexander Graf wrote: >>>>> >>>>> >>>>> On 09.05.2012, at 10:14, Gleb Natapov <gleb@xxxxxxxxxx> wrote: >>>>> >>>>>> On Wed, May 09, 2012 at 12:07:04AM +0200, Alexander Graf wrote: >>>>>>> >>>>>>> On 08.05.2012, at 22:14, Eduardo Habkost wrote: >>>>>>> >>>>>>>> On Tue, May 08, 2012 at 02:58:11AM +0200, Alexander Graf wrote: >>>>>>>>> On 07.05.2012, at 20:21, Eduardo Habkost wrote: >>>>>>>>> >>>>>>>>>> >>>>>>>>>> Andre? Are you able to help to answer the question below? >>>>>>>>>> >>>>>>>>>> I would like to clarify what's the expected behavior of "-cpu host" to >>>>>>>>>> be able to continue working on it. I believe the code will need to be >>>>>>>>>> fixed on either case, but first we need to figure out what are the >>>>>>>>>> expectations/requirements, to know _which_ changes will be needed. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Tue, Apr 24, 2012 at 02:19:25PM -0300, Eduardo Habkost wrote: >>>>>>>>>>> (CCing Andre Przywara, in case he can help to clarify what's the >>>>>>>>>>> expected meaning of "-cpu host") >>>>>>>>>>> >>>>>>>>>> [...] >>>>>>>>>>> I am not sure I understand what you are proposing. Let me explain the >>>>>>>>>>> use case I am thinking about: >>>>>>>>>>> >>>>>>>>>>> - Feature FOO is of type (A) (e.g. just a new instruction set that >>>>>>>>>>> doesn't require additional userspace support) >>>>>>>>>>> - User has a Qemu vesion that doesn't know anything about feature FOO >>>>>>>>>>> - User gets a new CPU that supports feature FOO >>>>>>>>>>> - User gets a new kernel that supports feature FOO (i.e. has FOO in >>>>>>>>>>> GET_SUPPORTED_CPUID) >>>>>>>>>>> - User does _not_ upgrade Qemu. >>>>>>>>>>> - User expects to get feature FOO enabled if using "-cpu host", without >>>>>>>>>>> upgrading Qemu. >>>>>>>>>>> >>>>>>>>>>> The problem here is: to support the above use-case, userspace need a >>>>>>>>>>> probing mechanism that can differentiate _new_ (previously unknown) >>>>>>>>>>> features that are in group (A) (safe to blindly enable) from features >>>>>>>>>>> that are in group (B) (that can't be enabled without an userspace >>>>>>>>>>> upgrade). >>>>>>>>>>> >>>>>>>>>>> In short, it becomes a problem if we consider the following case: >>>>>>>>>>> >>>>>>>>>>> - Feature BAR is of type (B) (it can't be enabled without extra >>>>>>>>>>> userspace support) >>>>>>>>>>> - User has a Qemu version that doesn't know anything about feature BAR >>>>>>>>>>> - User gets a new CPU that supports feature BAR >>>>>>>>>>> - User gets a new kernel that supports feature BAR (i.e. has BAR in >>>>>>>>>>> GET_SUPPORTED_CPUID) >>>>>>>>>>> - User does _not_ upgrade Qemu. >>>>>>>>>>> - User simply shouldn't get feature BAR enabled, even if using "-cpu >>>>>>>>>>> host", otherwise Qemu would break. >>>>>>>>>>> >>>>>>>>>>> If userspace always limited itself to features it knows about, it would >>>>>>>>>>> be really easy to implement the feature without any new probing >>>>>>>>>>> mechanism from the kernel. But that's not how I think users expect "-cpu >>>>>>>>>>> host" to work. Maybe I am wrong, I don't know. I am CCing Andre, who >>>>>>>>>>> introduced the "-cpu host" feature, in case he can explain what's the >>>>>>>>>>> expected semantics on the cases above. >>>>>>>>> >>>>>>>>> Can you think of any feature that'd go into category B? >>>>>>>> >>>>>>>> - TSC-deadline: can't be enabled unless userspace takes care to enable >>>>>>>> the in-kernel irqchip. >>>>>>> >>>>>>> The kernel can check if in-kernel irqchip has it enabled and otherwise mask it out, no? >>>>>>> >>>>>> How kernel should know that userspace does not emulate it? >>>>> >>>>> You have to enable the in-kernel apic to use it, at which point the kernel knows it's in use, right? >>>>> >>>>>> >>>>>>>> - x2apic: ditto. >>>>>>> >>>>>>> Same here. For user space irqchip the kernel side doesn't care. If in-kernel APIC is enabled, check for its capabilities. >>>>>>> >>>>>> Same here. >>>>>> >>>>>> Well, technically both of those features can't be implemented in >>>>>> userspace right now since MSRs are terminated in the kernel, but I >>>>> >>>>> Doesn't sound like the greatest design - unless you deprecate the non-in-kernel apic case. >>>>> >>>> You mean terminating MSRs in kernel does not sound like the greatest >>>> design? I do not disagree. That is why IMO kernel can't filter out >>>> TSC-deadline and x2apic like you suggest. >>> >>> I still don't see why it can't. >>> >>> Imagine we would filter TSC-deadline and x2apic by default in the kernel - they are not known to exist yet. >>> Now, we implement TSC-deadline in the kernel. We still filter >>> TSC-deadline out in GET_SUPORTED_CPUID in the kernel. But we provide >>> an interface to user space that says "call me to enable TSC-deadline >>> CPUID, but only if you're using the in-kernel apic" > > We have that interface already, it is called KVM_SET_CPUID. :-) > >>> New user space calls that ioctl when it's using the in-kernel apic, it doesn't when it's using the user space apic. >>> Old user space doesn't call that ioctl. >> First of all we already have TSC-deadline in GET_SUPORTED_CPUID without >> any additional ioctls. And second I do not see why we need additional >> iostls here. > > We don't have TSC-deadline set today (and that's what started this > thread), but we have x2apic. So what you say is true for x2apic, anyway. > >> Hmm, so may be I misunderstood you. You propose to mask TSC-deadline and >> x2apic out from GET_SUPORTED_CPUID if irq chip is not in kernel, not >> from KVM_SET_CPUID? For those two features it may make sense indeed. > > It makes sense to me. > > It looks like my assumptions were wrong. They were: > > - GET_SUPPORTED_CPUID simply can't know if the in-kernel irqchip is > going to be enabled or not. > - GET_SUPPORTED_CPUID output has to be a function of the kernel code > capabilitie and host CPU, and not depend on any input from userspace. > > Are those assumptions incorrect? If we break them, we may try what > Alexander is proposing. It would be much more flexible than the options > I was considering. > > I didn't know ENABLE_CAP existed. Even if GET_SUPPORTED_CPUID can't > check for the in-kernel irqchip setup for some reason, ENABLE_CAP could > be used by userpace to tell the kernel "I will enable the in-kernel > irqchip, so feel free to return features that depend on it on > GET_SUPPORTED_CPUID". > > In other words, we would return only the type-A features on > GET_SUPPORTED_CPUID (i.e. safe to be blindly enabled by -cpu host as > long as migration is not required), but if we use ENABLE_CAP we can make > group A safely grow, as long as userspace first tells the kernel what it > supports. > > Anybody is against doing that? Otherwise I plan to work on this. I think it makes a lot of sense to go this route. But get your blessing from Avi first - ultimately he owns the API :). Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html