On 10/3/2023 1:46 PM, Peter Zijlstra wrote:
> On Mon, Oct 02, 2023 at 05:56:28PM -0700, Sean Christopherson wrote:
>> On Mon, Oct 02, 2023, Peter Zijlstra wrote:
>
>>> I'm not sure what you're suggesting here. It will have to save/restore
>>> all those MSRs anyway. Suppose it switches between vCPUs.
>>
>> The "when" is what's important. If KVM took a literal interpretation of
>> "exclude guest" for pass-through MSRs, then KVM would context switch all those
>> MSRs twice for every VM-Exit=>VM-Enter roundtrip, even when the VM-Exit isn't a
>> reschedule IRQ to schedule in a different task (or vCPU). The overhead to save
>> all the host/guest MSRs and load all of the guest/host MSRs *twice* for every
>> VM-Exit would be a non-starter. E.g. simple VM-Exits are completely handled in
>> <1500 cycles, and "fastpath" exits are something like half that. Switching all
>> the MSRs is likely 1000+ cycles, if not double that.
>
> See, you're the virt-nerd and I'm sure you know what you're talking
> about, but I have no clue :-) I didn't know there were different levels
> of vm-exit.
>
>> FWIW, the primary use case we care about is for slice-of-hardware VMs, where each
>> vCPU is pinned 1:1 with a host pCPU.
>
> I've been given to understand that vm-exit is a bad word in this
> scenario, any exit is a fail. They get MWAIT and all the other crap and
> more or less pretend to be real hardware.
>
> So why do you care about those MSRs so much? That should 'never' happen
> in this scenario.
>
>>>> Or at least, that was my reading of things. Maybe it was just a
>>>> misunderstanding because we didn't do a good job of defining the behavior.
>>>
>>> This might be the case. I don't particularly care where the guest
>>> boundary lies -- somewhere in the vCPU thread. Once the thread is gone,
>>> PMU is usable again etc..
>>
>> Well drat, that there would have saved a wee bit of frustration. Better late
>> than never though, that's for sure.
>>
>> Just to double confirm: keeping guest PMU state loaded until the vCPU is scheduled
>> out or KVM exits to userspace, would mean that host perf events won't be active
>> for potentially large swaths of non-KVM code. Any function calls or event/exception
>> handlers that occur within the context of ioctl(KVM_RUN) would run with host
>> perf events disabled.
>
> Hurmph, that sounds sub-optimal, earlier you said <1500 cycles, this all
> sounds like a ton more.
>
> /me frobs around the kvm code some...
>
> Are we talking about exit_fastpath loop in vcpu_enter_guest() ? That
> seems to run with IRQs disabled, so at most you can trigger a #PF or
> something, which will then trip an exception fixup because you can't run
> #PF with IRQs disabled etc..
>
> That seems fine. That is, a theoretical kvm_x86_handle_enter_irqoff()
> coupled with the existing kvm_x86_handle_exit_irqoff() seems like
> reasonable solution from where I'm sitting. That also more or less
> matches the FPU state save/restore AFAICT.
>
> Or are you talking about the whole of vcpu_run() ? That seems like a
> massive amount of code, and doesn't look like anything I'd call a
> fast-path. Also, much of that loop has preemption enabled...
>
>> Are you ok with that approach? Assuming we don't completely botch things, the
>> interfaces are sane, we can come up with a clean solution for handling NMIs, etc.
>
> Since you steal the whole PMU, can't you re-route the PMI to something
> that's virt friendly too?
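
To make the boundary question above concrete, here is a compile-only C
sketch of the two candidate switch points being debated. It is not actual
KVM code: struct vcpu, guest_pmu_load()/guest_pmu_put() and the loop shape
are made-up stand-ins for "swap the pass-through PMU MSRs".

/* Compile-only sketch -- NOT actual KVM code. */
struct vcpu { int unused; };	/* stand-in for struct kvm_vcpu */

static void guest_pmu_load(struct vcpu *v) { (void)v; /* wrmsr guest PMU state        */ }
static void guest_pmu_put(struct vcpu *v)  { (void)v; /* save guest, restore host PMU */ }

static void vcpu_run_sketch(struct vcpu *v, int (*enter_guest)(struct vcpu *))
{
	/*
	 * Coarse boundary (what Sean asks to confirm): load guest PMU state
	 * once here and keep it live until sched-out or return to userspace,
	 * so host exclude_guest events stay off for the whole ioctl(KVM_RUN),
	 * including host code run between exits.
	 */
	guest_pmu_load(v);

	for (;;) {
		/*
		 * Fine boundary (a literal "exclude guest"): put/load would
		 * instead wrap every VM-Enter/VM-Exit inside this loop --
		 * the extra 1000+ cycles per exit Sean calls a non-starter.
		 */
		if (enter_guest(v))	/* non-zero: need userspace/reschedule */
			break;
	}

	guest_pmu_put(v);
}
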
>
>>> It also means ::exclude_guest should actually work -- it often does not
>>> today -- the IBS thing for example totally ignores it.
>>
>> Is that already an in-tree, or are you talking about Manali's proposed series to
>> support virtualizing IBS?
>
> The IBS code as is, it totally ignores ::exclude_guest. Manali was going
> to add some of it. But I'm not at all sure about the state of the other
> PMU drivers we have.
>
> Just for giggles, P4 has VMX support... /me runs like crazy

I am working on Solution 1.1 from the approach proposed in [*]. I will send
V2 (for the IBS virtualization series) based on it shortly.

* https://lore.kernel.org/all/20230908133114.GK19320@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/T/#m7389910e577966c93a0b50fbaf9442be80dc730b

- Manali
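
For reference only (not part of the thread), a minimal userspace sketch of
what the ::exclude_guest bit discussed above is supposed to request -- a
counting event that is paused while a guest runs on the CPU, which is
exactly the bit Peter notes the IBS driver currently ignores. This is
ordinary perf_event_open() usage, nothing specific to KVM or IBS.

#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	struct perf_event_attr attr;
	uint64_t count;
	int fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.disabled = 1;
	attr.exclude_guest = 1;		/* ask not to count while a guest is running */

	/* Count this thread on any CPU; no glibc wrapper, so use syscall(). */
	fd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
	if (fd < 0) {
		perror("perf_event_open");
		return 1;
	}

	ioctl(fd, PERF_EVENT_IOC_RESET, 0);
	ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
	/* ... workload to measure goes here ... */
	ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

	if (read(fd, &count, sizeof(count)) == (ssize_t)sizeof(count))
		printf("cycles (host-only, if the PMU driver honors the bit): %llu\n",
		       (unsigned long long)count);
	close(fd);
	return 0;
}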