On Fri, Jul 12, 2024, Steven Rostedt wrote:
> On Fri, 12 Jul 2024 09:44:16 -0700
> Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
>
> > > All we need is a notifier that gets called at every VMEXIT.
> >
> > Why?  The only argument I've seen for needing to hook VM-Exit is so that the
> > host can speculatively boost the priority of the vCPU when delivering an IRQ,
> > but (a) I'm unconvinced that is necessary, i.e. that the vCPU needs to be boosted
> > _before_ the guest IRQ handler is invoked and (b) it has almost no benefit on
> > modern hardware that supports posted interrupts and IPI virtualization, i.e. for
> > which there will be no VM-Exit.
>
> No. The speculative boost was for something else, but slightly
> related. I guess the idea there was to have the interrupt coming in
> boost the vCPU because the interrupt could be waking an RT task. It may
> still be something needed, but that's not what I'm talking about here.
>
> The idea here is when an RT task is scheduled in on the guest, we want
> to lazily boost it. As long as the vCPU is running on the CPU, we do
> not need to do anything. If the RT task is scheduled for a very short
> time, it should not need to call any hypercall. It would set the shared
> memory to the new priority when the RT task is scheduled, and then put
> back the lower priority when it is scheduled out and a SCHED_OTHER task
> is scheduled in.
>
> Now if the vCPU gets preempted, it is this moment that we need the host
> kernel to look at the current priority of the task thread running on
> the vCPU. If it is an RT task, we need to boost the vCPU to that
> priority, so that a lower priority host thread does not interrupt it.

I got all that, but I still don't see any need to hook VM-Exit.  If the vCPU gets
preempted, the host scheduler is already getting "notified", otherwise the vCPU
would still be scheduled in, i.e. wouldn't have been preempted.

> The host should also set a bit in the shared memory to tell the guest
> that it was boosted. Then when the vCPU schedules a lower priority task
> than what is in shared memory, and the bit is set that tells the guest
> the host boosted the vCPU, it needs to make a hypercall to tell the
> host that it can lower its priority again.

Which again doesn't _need_ a dedicated/manual VM-Exit.  E.g. why force the host to
reassess the priority instead of simply waiting until the next reschedule?  If the
host is running tickless, then presumably there is a scheduling entity running on
a different pCPU, i.e. that can react to vCPU priority changes without needing a
VM-Exit.
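
To make the protocol under discussion concrete, here is a rough userspace model in C of the lazy boost described in the quoted text. Every name in it (the struct layout, the function names, the hypercall counter) is hypothetical and not taken from any posted patch. It only models the state transitions; it is a sketch, not a proposal for how KVM or the scheduler should implement this. Note that the host-side step runs from the point where the vCPU thread is preempted, which the host scheduler already knows about, so nothing here relies on a per-VM-Exit hook.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guest/host shared page; layout is an assumption. */
struct vcpu_sched_shared {
	atomic_uint guest_prio;   /* guest writes: 0 = SCHED_OTHER, >0 = RT prio */
	atomic_bool host_boosted; /* host writes: vCPU thread is currently boosted */
};

/* Stand-in for the host's view of the vCPU thread (assumption). */
struct vcpu_thread {
	uint32_t host_prio;   /* priority the host scheduler uses for the vCPU */
	unsigned hypercalls;  /* number of guest -> host unboost requests */
};

/* Host: models handling of the (hypothetical) "unboost me" hypercall. */
static void hypercall_unboost(struct vcpu_sched_shared *sh,
			      struct vcpu_thread *t)
{
	t->hypercalls++;
	t->host_prio = atomic_load(&sh->guest_prio);
	atomic_store(&sh->host_boosted, false);
}

/*
 * Guest: called on every guest context switch.  The fast path is a plain
 * shared-memory write; a hypercall happens only when the priority drops
 * and the host has flagged that it boosted the vCPU.
 */
static void guest_sched_switch(struct vcpu_sched_shared *sh,
			       struct vcpu_thread *t, uint32_t next_prio)
{
	uint32_t prev = atomic_exchange(&sh->guest_prio, next_prio);

	if (next_prio < prev && atomic_load(&sh->host_boosted))
		hypercall_unboost(sh, t);
}

/*
 * Host: called when the vCPU thread is preempted (scheduled out).  This is
 * the only point where the host needs to look at the guest's priority.
 */
static void host_vcpu_preempted(struct vcpu_sched_shared *sh,
				struct vcpu_thread *t)
{
	uint32_t prio = atomic_load(&sh->guest_prio);

	if (prio > t->host_prio) {
		t->host_prio = prio;
		atomic_store(&sh->host_boosted, true);
	}
}
```

In this model, an RT task that runs and finishes without the vCPU being preempted costs zero hypercalls. The one remaining hypercall, in guest_sched_switch(), is the part questioned above: dropping it and letting the host re-read guest_prio at its next reschedule would remove the forced exit at the cost of the vCPU staying boosted a little longer.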