On 3/3/25 02:11, Yan Zhao wrote:
the main issue with this series is that the quirk is not disabled only for
TDX VMs, but for *all* VMs if TDX is available.
Yes, once TDX is enabled, the quirk is disabled for all VMs.
My thought is that on TDX as a new platform, users have the option to update
guest software to address bugs caused by incorrect guest PAT settings.
If you think it's a must to support old unmodifiable non-TDX VMs on TDX
platforms, then it's indeed an issue of this series.
Yeah, unfortunately I think we need to keep the quirk for old VMs. But
I think the code changes needed to do so are small and good to have anyway.
There are two concepts here:
- which quirks can be disabled
- which quirks are active
I agree with making the first vendor-dependent, but for a different reason:
the new KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT must be hidden if self-snoop is
not present.
I think it's a good idea to make KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT out of
KVM_CAP_DISABLE_QUIRKS2, so that the quirk is always enabled when self-snoop is
not present as userspace has no way to disable this quirk.
However, this seems to contradict your point below, especially since it is even
present on AMD platforms.
"we need to expose the quirk anyway in KVM_CAP_DISABLE_QUIRKS2, so that
userspace knows that KVM is *aware* of a particular issue", "even if disabling
it has no effect, userspace may want to know that it can rely on the problematic
behavior not being present".
There are four cases:
* quirk cannot be disabled: example, "ignore guest PAT" on
non-self-snoop machines: the quirk must not be in KVM_CAP_DISABLE_QUIRKS2
* quirk can be disabled: the quirk must be in KVM_CAP_DISABLE_QUIRKS2
* quirk is always disabled: right now we're always exposing those in
KVM_CAP_DISABLE_QUIRKS2, so we should keep that behavior. If desired we
could add a capability like KVM_CAP_DISABLED_QUIRKS
* for some VMs, quirk is always disabled: this is the case also for the
zap_all quirk that you have previously introduced. Right now there's no
way to query it, but KVM_CAP_DISABLED_QUIRKS would also cover this. If
KVM_CAP_DISABLED_QUIRKS was introduced, zap_all could be added too.
So, could we also expose KVM_X86_QUIRK_EPT_IGNORE_GUEST_PAT in
KVM_CAP_DISABLE_QUIRKS2 on Intel platforms without self-snoop, but ensure that
disabling the quirk has no effect?
To keep the API clear, disabling the quirk should *always* have the
effect of going to the non-quirky behavior. Which may be no effect at
all if the non-quirky behavior is the only one---but the important thing
is that you don't want the quirky/buggy/non-architectural behavior after
a successful KVM_ENABLE_CAP(KVM_CAP_DISABLE_QUIRKS2).
There is a pre-existing bug in that I think
KVM_ENABLE_CAP(KVM_CAP_DISABLE_QUIRKS2) should be cumulative, i.e.
should not allow re-enabling a previously-disabled quirk. I think we
can change that without worrying about breaking userspace there, as the
current behavior is the most surprising.
As to the second, we already have an example of a quirk that is also active,
though we don't represent that in kvm->arch.disabled_quirks: that's
KVM_X86_QUIRK_CD_NW_CLEARED which is for AMD only and is effectively always
disabled on Intel platforms. For those cases, we need to expose the quirk
I also have a concern about this one. Please find my comments in v2.
Ok, I'll reply there too.
anyway in KVM_CAP_DISABLE_QUIRKS2, so that userspace knows that KVM is
*aware* of a particular issue. In other words, even if disabling it has no
effect, userspace may want to know that it can rely on the problematic
behavior not being present.
I'm testing an alternative series and will post it shortly.
Thanks a lot for helping with refining the patches!
Thanks to you and sorry that the patches weren't of the best quality - I
mostly wanted to start the discussion on the userspace API side before
the beginning of the week in your time zone.
Paolo