On Wed, Feb 19, 2020 at 12:19:01PM +0100, Erwan Velu wrote:
> On 18/02/2020 19:48, Sean Christopherson wrote:
> [...]
> >Fix userspace to only do the "add" on one CPU.
> >
> >Changing kvm_arch_init() to use pr_err_once() for the disabled_by_bios()
> >case "works", but it's effectively a hack to workaround a flawed userspace.
>
> I'll see with the user space tool to sort this out.
>
> I'm also considering how "wrong" is what they do: udevadm trigger is
> generating 3500 "uevent add" on my system and I only noticed kvm to print
> this noisy message.
>
> So as the print once isn't that "wrong" neither, this simple patch would
> avoid polluting the kernel logs.
>
>
> So my proposal would be
>
> - have this simple patch on the kernel side to avoid having userspace apps
> polluting logs
>
> - contacting the udev people to see the rational & fix it too : I'll do that
>
>
> As you said, once probed, there is no need reprinting the same message again
> as the situation cannot have changed.

For this exact scenario, on Intel/VMX, this is mostly true.  But, the MSR
check for AMD/SVM has a disable bit that takes effect irrespective of the
MSR's locked bit, i.e. SVM could theoretically change state without any
super special behavior.

Even on Intel, the state can potentially change, especially on a system
with a misbehaving BIOS.  FEATURE_CONTROL is cleared on CPU RESET, e.g. VMX
enabling could change if BIOS "forgets" to reinitialize the MSR upon waking
from S3 (suspend).  Things get really weird if we consider the case where
BIOS leaves the MSR unlocked after S3, the user manually writes the MSR,
and then it gets cleared again on a different S3 transition.  SVM is even
more sensitive because VM_CR is cleared on INIT, not just RESET.

> As we are on the preliminary return code path (to make a EOPNOTSUPP), I
> would vote for protecting the print against multiple calls/prints.
>
> The kernel patch isn't impacting anyone (in a regular case) and just avoid
> pollution.
>
> Would you agree on that ?

Sadly, no.  Don't get me wrong, I completely agree that, ideally, KVM would
not spam the log, even when presented with a misbehaving userspace.

My hesitation about changing the error message to pr_err_once() isn't so
much about right versus wrong as it is about creating misleading and
potentially confusing code in KVM, and setting a precedent that I don't
think we want to carry forward.

E.g. the _once() doesn't hold true if the kvm module is unloaded and
reloaded, and other error messages such as basic CPU support would still
be unlimited.  The basic CPU support message definitely should *not* be
_once() as that would squash messages when loading the wrong vendor module.

As for setting a precedent, moving the error message to the vendor module
or making kvm a monolithic module would "break" the _once() behavior.

And, the current systemd behavior is actually quite dangerous, e.g. on a
misconfigured system where SVM is enabled on a subset of CPUs, probing KVM
on every CPU essentially guarantees that KVM will be loaded on a broken
system.  In that case, I think we actually want the spam.  Note, as of
kernel 5.6, this doesn't apply to VMX as kvm_intel will no longer load on a
misconfigured system since FEATURE_CONTROL configuration is incorporated
into the per-CPU checks.

All of that being said, what about converting all of the error messages to
pr_err_ratelimited()?  That would take the edge off this particular problem,
wouldn't create inconsistencies between error messages, and wouldn't
completely squash error messages in corner case scenarios on misconfigured
systems.
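
To make the trade-off concrete, here is a rough userspace model of the two
policies.  It is only a sketch, not the kernel's implementation: the names
(rl_state, rl_allow, once_allow, count_logged) are hypothetical, time is an
abstract tick counter, and the burst/interval values used in the example
below are chosen to resemble the kernel's default ratelimit of 10 messages
per 5 seconds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a rate limiter; not kernel code. */
struct rl_state {
	long interval;	/* window length, in abstract ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* messages emitted in the current window */
	bool started;	/* false until the first message is seen */
};

/* Returns true if a message arriving at tick 'now' may be emitted. */
bool rl_allow(struct rl_state *rs, long now)
{
	assert(rs->interval > 0 && rs->burst > 0);

	/* Open a new window on first use or once the old one has expired. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->printed = 0;
		rs->started = true;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}
	return false;
}

/* Print-once model: only the first call ever returns true. */
bool once_allow(bool *done)
{
	if (*done)
		return false;
	*done = true;
	return true;
}

/* Count how many of 'n' back-to-back failures at tick 'now' get logged. */
int count_logged(struct rl_state *rs, int n, long now)
{
	int shown = 0;

	for (int i = 0; i < n; i++)
		if (rl_allow(rs, now))
			shown++;
	return shown;
}
```

With interval = 5 and burst = 10, a storm of 3500 probes inside one window
is cut down to 10 logged messages, and a later failure in a new window is
logged again.  The once model stays silent forever after the first message,
which is exactly what hides a later state change.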