On Thu, Jun 14, 2012 at 07:02:03PM +0000, Liu, Jinsong wrote: > Eduardo, Jan > > I will update tsc deadline timer patch (at qemu-kvm side) recently. > Have you made a final agreement of the issue 'KVM_CAP_TSC_DEADLINE_TIMER' vs. 'GET_SUPPORTED_CPUID'? I don't think there's a final agreement, but I was convinced later that it's probably better to _not_ have TSC-deadline on GET_SUPPORTED_CPUID, at least not by default. Even if this is changed in the future, it's a good idea to make qemu support the KVM_CAP_TSC_DEADLINE_TIMER method if running on older kernels, anyway. > Eduardo Habkost wrote: > > (CCing Andre Przywara, in case he can help to clarify what's the > > expected meaning of "-cpu host") > > > > On Tue, Apr 24, 2012 at 06:06:55PM +0200, Jan Kiszka wrote: > >> On 2012-04-23 22:02, Eduardo Habkost wrote: > >>> On Mon, Apr 23, 2012 at 06:31:25PM +0200, Jan Kiszka wrote: > >>>> However, that was how I interpreted this GET_SUPPORTED_CPUID. In > >>>> fact, > >>>> it is used as "kernel or hardware does not _prevent_" already. And > >>>> in that sense, it's ok to enable even features that are not in > >>>> kernel/hardware hands. We should point out this fact in the > >>>> documentation. > >>> > >>> I see GET_SUPPORTED_CPUID as just a "what userspace can enable > >>> because the kernel and the hardware support it (= don't prevent > >>> it), as long as userspace has the required support" (meaning A+B). > >>> It's a bit like KVM_CHECK_EXTENSION, but with the nice feature that > >>> that the capabilities map directly to CPUID bits. > >>> > >>> So, it's not clear to me: now you are OK with adding TSC_DEADLINE > >>> to GET_SUPPORTED_CPUID? > >>> > >>> But we still have the issue of "-cpu host" not knowing what can be > >>> safely enabled (without userspace feature-specific setup code), or > >>> not. Do you have any suggestion for that? Avi, do you have any > >>> suggestion? > >> > >> First of all, I bet this was already broken with the introduction of > >> x2apic. So TSC deadline won't make it worse. I guess we need to > >> address this in userspace, first by masking those features out, > >> later by actually emulating them. > > > > I am not sure I understand what you are proposing. Let me explain the > > use case I am thinking about: > > > > - Feature FOO is of type (A) (e.g. just a new instruction set that > > doesn't require additional userspace support) > > - User has a Qemu vesion that doesn't know anything about feature FOO > > - User gets a new CPU that supports feature FOO > > - User gets a new kernel that supports feature FOO (i.e. has FOO in > > GET_SUPPORTED_CPUID) > > - User does _not_ upgrade Qemu. > > - User expects to get feature FOO enabled if using "-cpu host", > > without upgrading Qemu. > > > > The problem here is: to support the above use-case, userspace need a > > probing mechanism that can differentiate _new_ (previously unknown) > > features that are in group (A) (safe to blindly enable) from features > > that are in group (B) (that can't be enabled without an userspace > > upgrade). > > > > In short, it becomes a problem if we consider the following case: > > > > - Feature BAR is of type (B) (it can't be enabled without extra > > userspace support) > > - User has a Qemu version that doesn't know anything about feature BAR > > - User gets a new CPU that supports feature BAR > > - User gets a new kernel that supports feature BAR (i.e. has BAR in > > GET_SUPPORTED_CPUID) > > - User does _not_ upgrade Qemu. > > - User simply shouldn't get feature BAR enabled, even if using "-cpu > > host", otherwise Qemu would break. > > > > If userspace always limited itself to features it knows about, it > > would be really easy to implement the feature without any new probing > > mechanism from the kernel. But that's not how I think users expect > > "-cpu host" to work. Maybe I am wrong, I don't know. I am CCing > > Andre, who introduced the "-cpu host" feature, in case he can explain > > what's the expected semantics on the cases above. > > > >> > >>> > >>> And I still don't know the answer to: > >>> > >>>>> - How to precisely define the groups (A) and (B)? > >>>>> - "requires additional code only if migration is required" > >>>>> qualifies as (B) or (A)? > >>> > >>> > >>> Re: documentation, isn't the following paragraph (already present > >>> on api.txt) sufficient? > >>> > >>> "The entries returned are the host cpuid as returned by the cpuid > >>> instruction, with unknown or unsupported features masked out. Some > >>> features (for example, x2apic), may not be present in the host cpu, > >>> but are exposed by kvm if it can emulate them efficiently." > >> > >> That suggests such features are always emulated - which is not true. > >> They are either emulated, or nothing _prevents_ their emulation by > >> user space. > > > > Well... it's a bit more complicated than that: the current semantics > > are a bit more than "doesn't prevent", as in theory every single > > feature can be emulated by userspace, without any help from the > > kernel. So, if "doesn't prevent" were the only criteria, the kernel > > would set every single feature bit on GET_SUPPORTED_CPUID, making it > > not very useful. > > > > At least in the case of x2apic, the kernel is using > > GET_SUPPORTED_CPUID to expose a _capability_ too: when x2apic is > > present on GET_SUPPORTED_CPUID, userspace knows that in addition to > > "not preventing" the feature from being enabled, the kernel is now > > able to emulate x2apic (if proper setup is made by userspace). A > > kernel that can't emulate x2apic (even if userspace was allowed to > > emulate it completely in userspace) would never have x2apic enabled on > > GET_SUPPORTED_CPUID. > > > > Like I said previously, in the end GET_SUPPORTED_CPUID is just a > > capability querying mechanism like KVM_CHECK_EXTENSION (where each > > extension have a specific kernel-capability meaning), but with the > > nice feature of being automatically mapped to CPUID bits (with no > > need for additional KVM_CAP_* => CPUID translation in userspace). > -- Eduardo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html