On Thu, Feb 27, 2025, Fernand Sieber wrote:
> On Wed, 2025-02-26 at 13:00 -0800, Sean Christopherson wrote:
> > On Wed, Feb 26, 2025, Fernand Sieber wrote:
> > > On Tue, 2025-02-25 at 18:17 -0800, Sean Christopherson wrote:
> > > > And if you're running vCPUs on tickless CPUs, and you're doing
> > > > HLT/MWAIT passthrough, *and* you want to schedule other tasks on those
> > > > CPUs, then IMO you're abusing all of those things and it's not KVM's
> > > > problem to solve, especially now that sched_ext is a thing.
> > >
> > > We are running vCPUs with ticks, the rest of your observations are
> > > correct.
> >
> > If there's a host tick, why do you need KVM's help to make scheduling
> > decisions?  It sounds like what you want is a scheduler that is primarily
> > driven by MPERF (and APERF?), and sched_tick() => arch_scale_freq_tick()
> > already knows about MPERF.
>
> Having the measure around VM enter/exit makes it easy to attribute the
> unhalted cycles to a specific task (vCPU), which solves both our use
> cases of VM metrics and scheduling. That said we may be able to avoid
> it and achieve the same results.
>
> i.e.
> * the VM metrics use case can be solved by using /proc/cpuinfo from
>   userspace.
> * for the scheduling use case, the tick-based sampling of MPERF means
>   we could potentially introduce a correcting factor on PELT accounting
>   of pinned vCPU tasks based on its value (similar to what I do in the
>   last patch of the series).
>
> The combination of these would remove the requirement of adding any
> logic around VM enter/exit to support our use cases.
>
> I'm happy to prototype that if we think it's going in the right
> direction?

That's mostly a question for the scheduler folks.  That said, from a KVM
perspective, sampling MPERF around entry/exit for scheduling purposes is a
non-starter.
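
For illustration only, the arithmetic behind a tick-based correcting factor
could look roughly like the userspace sketch below.  It is not existing kernel
code: scale_by_unhalted(), SCALE_SHIFT and SCALE_ONE are hypothetical names,
and it assumes MPERF and TSC advance at the same rate while the CPU is in C0,
which is not guaranteed on all parts.  How (or whether) such a factor would
feed into PELT is exactly the open question for the scheduler folks.

```c
#include <stdint.h>

/* Hypothetical fixed-point scale (1024 == 1.0), chosen for illustration. */
#define SCALE_SHIFT 10
#define SCALE_ONE   (1u << SCALE_SHIFT)

/*
 * Scale a utilization contribution by the fraction of the sampling window
 * during which the CPU was unhalted.
 *
 * Assumptions (not taken from the series):
 *  - mperf_delta and tsc_delta cover the same window, e.g. one tick;
 *  - MPERF advances only in C0, at the same rate as TSC;
 *  - mperf_delta is small enough that shifting it left by SCALE_SHIFT
 *    cannot overflow 64 bits, which holds for tick-sized windows.
 */
static uint64_t scale_by_unhalted(uint64_t contrib, uint64_t mperf_delta,
                                  uint64_t tsc_delta)
{
    uint64_t ratio;

    if (tsc_delta == 0)
        return contrib;         /* empty window: leave unscaled */
    if (mperf_delta >= tsc_delta)
        return contrib;         /* fully unhalted or skewed sample: clamp to 1.0 */

    ratio = (mperf_delta << SCALE_SHIFT) / tsc_delta;
    return (contrib * ratio) >> SCALE_SHIFT;
}
```

With this sketch, a pinned vCPU task whose CPU was unhalted for half of the
window would have its contribution halved, and a window spent entirely in
HLT/MWAIT would contribute nothing.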