On Tue, Mar 12, 2024, Kevin Tian wrote: > > From: Sean Christopherson <seanjc@xxxxxxxxxx> > > Sent: Tuesday, March 12, 2024 8:26 AM > > > > On Mon, Mar 11, 2024, Yan Zhao wrote: > > > For the case of !static_cpu_has(X86_FEATURE_SELFSNOOP) && > > > kvm_arch_has_noncoherent_dma(vcpu->kvm), I think we at least should warn > > > about unsafe before honoring guest memory type. > > > > I don't think it gains us enough to offset the potential pain such a > > message would bring. Assuming the warning isn't outright ignored, the most > > likely scenario is that the warning will cause random end users to worry > > that the setup they've been running for years is broken, when in reality > > it's probably just fine for their > > use case. > > Isn't the 'worry' necessary to allow end users evaluate whether "it's > probably just fine for their use case"? Realistically, outside of large scale deployments, no end user is going to be able to make that evaluation, because practically speaking it requires someone with quite low-level hardware knowledge to be able to make that judgment call. And counting by number of human end users (as opposed to number of VMs being run), I am willing to bet that the overwhelming majority of KVM users aren't kernel or systems engineers. Understandably, users tend to be alarmed by (or suspicious of) new warnings that show up. E.g. see the ancient KVM_SET_TSS_ADDR pr_warn[*]. And recently, we had an internal bug report filed against KVM because they observed a performance regression when booting a KVM guest, and saw a new message about some CPU vulnerability being mitigated on VM-Exit that showed up in their *guest* kernel. In short, my concern is that adding a new pr_warn() will generate noise for end users *and* for KVM developers/maintainers, because even if we phrase the message to talk specifically about "untrusted workloads", the majority of affected users will not have the necessary knowledge to make an informed decision. [*] https://lore.kernel.org/all/f1afa6c0-cde2-ab8b-ea71-bfa62a45b956@xxxxxxxxx > I saw the old comment already mentioned that doing so may lead to unexpected > behaviors. But I'm not sure whether such code-level caveat has been visible > enough to end users. Another point to consider: KVM is _always_ potentially broken on such CPUs, as KVM forces WB for guest accesses. I.e. KVM will create memory aliasing if the host has guest memory mapped as non-WB in the PAT, without non-coherent DMA exposed to the guest. > > I would be quite surprised if there are people running untrusted workloads > > on 10+ year old silicon *and* have passthrough devices and non-coherent > > IOMMUs/DMA. > > this is probably true. > > > And anyone exposing a device directly to an untrusted workload really > > should have done their homework. > > or they run trusted workloads which might be tampered by virus to > exceed the scope of their homework. 😊 If a workload is being run in a KVM guest for host isolation/security purposes, and a device was exposed to said workload, then I would firmly consider analyzing the impact of a compromised guest to be part of their homework. > > And it's not like we're going to change KVM's historical behavior at this point. > > I agree with your point of not breaking userspace. But still think a warning > might be informative to let users evaluate their setup against a newly > identified "unexpected behavior" which has security implication beyond > the guest, while the previous interpretation of "unexpected behavior" > might be that the guest can at most shoot its own foot... If this issue weren't limited to 10+ year old hardware, I would be more inclined to add a message. But at this point, realistically the only thing KVM would be saying is "you're running old hardware, that might be unsafe in today's world". For users that care about security, we'd be telling them something they already know (and if they don't know, they've got bigger problems). And for everyone else, it'd be scary noise without any meaningful benefit.