On Tue, 2024-10-22 at 12:00 -0700, Sean Christopherson wrote: > On Mon, Oct 21, 2024, Sean Christopherson wrote: > > On Wed, Oct 04, 2023, Maxim Levitsky wrote: > > > About the added 'vcpu->loaded' variable, I added it also because it is > > > something that is long overdue to be added, I remember that in IPIv code > > > there was also a need for this, and probalby more places in KVM can be > > > refactored to take advantage of it, instead of various hacks. > > > > I don't view using the information from the Physical ID table as a hack. It very > > explicitly uses the ir_list_lock to ensure that the pCPU that's programmed into > > the IRTE is the pCPU on which the vCPU is loaded, and provides rather strict > > ordering between task migration and device assignment. It's not a super hot path, > > so I don't think lockless programming is justified. If you strongly prefer this I won't argue. KVM does read back its SPTE entries, which is also something I can't say that I like that much. > > > > I also think we should keep IsRunning=1 when the vCPU is unloaded. That approach > > won't run afoul of your concern with signaling the wrong pCPU, because KVM can > > still keep the ID up-to-date, e.g. if the task is migrated when a pCPU is being > > offlined. > > > > The motiviation for keeping IsRunning=1 is to avoid unnecessary VM-Exits and GA > > log IRQs. E.g. if a vCPU exits to userspace, there's zero reason to force IPI > > senders to exit, because KVM can't/won't notify userspace, and the pending virtual > > interrupt will be processed on the next VMRUN. > > My only hesitation to keeping IsRunning=1 is that there could, in theory, be a > noisy neighbor problem. E.g. if there is meaningful overhead when the CPU responds > to the doorbell. I once measured this by bombarding a regular CPU, which is not running any guests, with AVIC doorbells. It was like 60% reduction of its performance if I remember correctly. So physical id table entries of a VM can't point to a CPU which doesn't run the VM's vCPU thread, because only in this case this doesn't pose a DOS risk. Same with IOMMU (malicious guest can in theory make an assigned device generate an interrupt storm, and then this storm can get redirected to a doorbell of a CPU which doesn't belong to a VM). Best regards, Maxim Levitsky > Hrm, and if another vCPU is scheduled in on the same pCPU, that > vCPU could end up processing a virtual interrupt in response to a doorbell intended > for a different vCPU. > > The counter-argument to both concerns is that APICv Posted Interrupts have had a > _worse_ version of that behavior for years, and no one has complained. KVM sets > PID.SN only when a vCPU is _preempted_, and so devices (and now virtual IPIs) will > send notification IRQs to pCPUs that aren't actively running the vCPU, or are > running a different vCPU. > > The counter-counter-argument is that (a) IPI virtualization is a recent addition, > and device posted interrupts are unlikely to be used in a CPU oversubscribed setup, > and (b) Posted Interrupts are effectively rate-limited to a single "spurious" > notification per vCPU, as notification IRQs are sent if and only if PID.ON=0. > > That said, while I'm somewhat less confident that keeping IsRunning=1 is desirable > for all use cases than I was yesterday, I still think we should avoid tightly > coupling it to whether or not the vCPU is loaded, because there are undoubtedly > setups where it _is_ desirable, e.g. if vCPUs are pinned 1:1 to pCPUs. >