On Wed, 2024-05-29 at 08:01 -0700, Sean Christopherson wrote: > On Tue, May 28, 2024, Kai Huang wrote: > > On 24/05/2024 2:39 pm, Chao Gao wrote: > > > On Fri, May 24, 2024 at 11:11:37AM +1200, Huang, Kai wrote: > > > > > > > > > > > > On 23/05/2024 4:23 pm, Chao Gao wrote: > > > > > On Thu, May 23, 2024 at 10:27:53AM +1200, Huang, Kai wrote: > > > > > > > > > > > > > > > > > > On 22/05/2024 2:28 pm, Sean Christopherson wrote: > > > > > > > Add an off-by-default module param, enable_virt_at_load, to let userspace > > > > > > > force virtualization to be enabled in hardware when KVM is initialized, > > > > > > > i.e. just before /dev/kvm is exposed to userspace. Enabling virtualization > > > > > > > during KVM initialization allows userspace to avoid the additional latency > > > > > > > when creating/destroying the first/last VM. Now that KVM uses the cpuhp > > > > > > > framework to do per-CPU enabling, the latency could be non-trivial as the > > > > > > > cpuhup bringup/teardown is serialized across CPUs, e.g. the latency could > > > > > > > be problematic for use case that need to spin up VMs quickly. > > > > > > > > > > > > How about we defer this until there's a real complain that this isn't > > > > > > acceptable? To me it doesn't sound "latency of creating the first VM" > > > > > > matters a lot in the real CSP deployments. > > > > > > > > > > I suspect kselftest and kvm-unit-tests will be impacted a lot because > > > > > hundreds of tests are run serially. And it looks clumsy to reload KVM > > > > > module to set enable_virt_at_load to make tests run faster. I think the > > > > > test slowdown is a more realistic problem than running an off-tree > > > > > hypervisor, so I vote to make enabling virtualization at load time the > > > > > default behavior and if we really want to support an off-tree hypervisor, > > > > > we can add a new module param to opt in enabling virtualization at runtime. > > I definitely don't object to making it the default behavior, though I would do so > in a separate patch, e.g. in case enabling virtualization by default somehow > causes problems. > > We could also add a Kconfig to control the default behavior, though IMO that'd be > overkill without an actual use case for having virtualization off by default. > > > > > I am not following why off-tree hypervisor is ever related to this. > > > > > > Enabling virtualization at runtime was added to support an off-tree hypervisor > > > (see the commit below). > > > > Oh, ok. I was thinking something else. > > > > > > > > > > The problem of enabling virt during module loading by default is it impacts > > > > all ARCHs. > > Pratically speaking, Intel is the only vendor where enabling virtualization is > interesting enough for anyone to care. On SVM and all other architectures, > enabling virtualization doesn't meaningfully change the functionality of the > current mode. > > And impacting all architectures isn't a bad thing. Yes, it requires getting buy-in > from more people, and maybe additional testing, though in this case we should get > that for "free" by virtue of being in linux-next. But those are one-time costs, > and not particular high costs. > > > > > Given this performance downgrade (if we care) can be resolved by > > > > explicitly doing on_each_cpu() below, I am not sure why we want to choose > > > > this radical approach. > > Because it's not radical? And manually doing on_each_cpu() requires complexity > that we would ideally avoid. > > 15+ years ago, when VMX and SVM were nascent technologies, there was likely a > good argument from a security perspective for leaving virtualization disabled. > E.g. the ucode flows were new _and_ massive, and thus a potentially juicy attack > surface. > > But these days, AFAIK enabling virtualization is not considered to be a security > risk, nor are there performance or stability downsides. E.g. it's not all that > different than the kernel enabling CR4.PKE even though it's entirely possible > userspace will never actually use protection keys. Agree this is not a security risk. If all other ARCHs are fine to enable on module loading then I don't see any real problem. > > > > IIUC, we plan to set up TDX module at KVM load time; we need to enable virt > > > at load time at least for TDX. Definitely, on_each_cpu() can solve the perf > > > concern. But a solution which can also satisfy TDX's need is better to me. > > > > > > > Doing on_each_cpu() explicitly can also meet TDX's purpose. We just > > explicitly enable virtualization during module loading if we are going to > > enable TDX. For all other cases, the behaivour remains the same, unless > > they want to change when to enable virtualization, e.g., when loading module > > by default. > > > > For always, or by default enabling virtualization during module loading, we > > somehow discussed before: > > > > https://lore.kernel.org/kvm/ZiKoqMk-wZKdiar9@xxxxxxxxxx/ > > > > My true comment is introducing a module parameter, which is a userspace ABI, > > Module params aren't strictly ABI, and even if they were, this would only be > problematic if we wanted to remove the module param *and* doing so was a breaking > change. > Yes. The concern is once introduced I don't think we can easily remove it even it becomes useless. > Enabling virtualization should be entirely transparent to userspace, > at least from a functional perspective; if changing how KVM enables virtualization > breaks userspace then we likely have bigger problems. I am not sure how should I interpret this? "having a module param" doesn't necessarily mean "entirely transparent to userspace", right? :-) > > > to just fix some performance downgrade seems overkill when it can be > > mitigated by the kernel. > > Performance is secondary for me, the primary motivation is simplifying the overall > KVM code base. Yes, we _could_ use on_each_cpu() and enable virtualization > on-demand for TDX, but as above, it's extra complexity without any meaningful > benefit, at least AFAICT. Either way works for me. I just think using a module param to resolve some problem while there can be solution completely in the kernel seems overkill :-)