Re: [RFC PATCH 0/3] kvm,sched: Add gtime halted

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Feb 26, 2025, Fernand Sieber wrote:
> On Tue, 2025-02-25 at 18:17 -0800, Sean Christopherson wrote:
> > > In this RFC we introduce the concept of guest halted time to address
> > > these concerns. Guest halted time (gtime_halted) accounts for cycles
> > > spent in guest mode while the cpu is halted. gtime_halted relies on
> > > measuring the mperf msr register (x86) around VM enter/exits to compute
> > > the number of unhalted cycles; halted cycles are then derived from the
> > > tsc difference minus the mperf difference.
> > 
> > IMO, there are better ways to solve this than having KVM sample MPERF on
> > every entry and exit.
> > 
> > The kernel already samples APERF/MPREF on every tick and provides that
> > information via /proc/cpuinfo, just use that.  If your userspace is unable
> > to use /proc/cpuinfo or similar, that needs to be explained.
> 
> If I understand correctly what you are suggesting is to have userspace
> regularly sampling these values to detect the most idle CPUs and then
> use CPU affinity to repin housekeeping tasks to these. While it's
> possible this essentially requires to implement another scheduling
> layer in userspace through constant re-pinning of tasks. This also
> requires to constantly identify the full set of tasks that can induce
> undesirable overhead so that they can be pinned accordingly. For these
> reasons we would rather want the logic to be implemented directly in
> the scheduler.
> 
> > And if you're running vCPUs on tickless CPUs, and you're doing HLT/MWAIT
> > passthrough, *and* you want to schedule other tasks on those CPUs, then IMO
> > you're abusing all of those things and it's not KVM's problem to solve,
> > especially now that sched_ext is a thing.
> 
> We are running vCPUs with ticks, the rest of your observations are
> correct.

If there's a host tick, why do you need KVM's help to make scheduling decisions?
It sounds like what you want is a scheduler that is primarily driven by MPERF
(and APERF?), and sched_tick() => arch_scale_freq_tick() already knows about MPERF.





[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux