On Thu, Feb 22, 2024 at 1:51 AM Vincent Guittot <vincent.guittot@xxxxxxxxxx> wrote:
>
> On Tue, 20 Feb 2024 at 07:16, zhaoyang.huang <zhaoyang.huang@xxxxxxxxxx> wrote:
> >
> > From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> >
> > As RT, DL and IRQ time can be deemed lost time for CFS tasks, some
>
> It's lost only if cfs has been actually preempted

Yes. Actually, I just want to get the approximate proportion of how much
the CFS tasks (the whole runqueue) are preempted. Preemption among CFS
tasks is not considered.

> > timing values need to know approximately how this lost time is
> > distributed, based on the utilization accounting values (nivcsw is
> > not always enough). This commit introduces a helper function to
> > achieve this goal.
> >
> > e.g.
> > Effective part of A = Total_time * cpu_util_cfs / cpu_util
> >
> > Timing value A
> > (should be a process lasting for several TICKs, or statistics of a
> > repeated process)
> >
> > Timing start
> > |
> > |
> > preempted by RT, DL or IRQ
> > |\
> > | This period is a nonvoluntary CPU give-up; we need to know how long
> > |/
>
> preempted means that a cfs task stops running on the cpu and lets
> another rt/dl task or an irq run on the cpu instead. We can't know
> that. We know an average ratio of time spent in rt/dl and irq contexts
> but not if the cpu was idle or running cfs task

OK, I will take idle time into consideration. As explained above,
preemption among cfs tasks is not considered on purpose.

> > sched in again
> > |
> > |
> > |
> > Timing end
> >
> > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> > ---
> >  include/linux/sched.h |  1 +
> >  kernel/sched/core.c   | 20 ++++++++++++++++++++
> >  2 files changed, 21 insertions(+)
> >
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 77f01ac385f7..99cf09c47f72 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -2318,6 +2318,7 @@ static inline bool owner_on_cpu(struct task_struct *owner)
> >
> >  /* Returns effective CPU energy utilization, as seen by the scheduler */
> >  unsigned long sched_cpu_util(int cpu);
> > +unsigned long cfs_prop_by_util(struct task_struct *tsk, unsigned long val);
> >  #endif /* CONFIG_SMP */
> >
> >  #ifdef CONFIG_RSEQ
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 802551e0009b..217e2220fdc1 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -7494,6 +7494,26 @@ unsigned long sched_cpu_util(int cpu)
> >  {
> >         return effective_cpu_util(cpu, cpu_util_cfs(cpu), ENERGY_UTIL, NULL);
> >  }
> > +
> > +/*
> > + * Calculate the approximate proportion of a timing value consumed in cfs.
> > + * The user must be aware that this is based on util_avg, which is tracked
> > + * as a geometric series decaying the load by y^32 = 0.5 (unit is 1ms).
> > + * That is, only periods lasting for at least several TICKs, or statistics
> > + * over repeated timing values, are suitable for this helper function.
> > + */
> > +unsigned long cfs_prop_by_util(struct task_struct *tsk, unsigned long val)
> > +{
> > +       unsigned int cpu = task_cpu(tsk);
> > +       struct rq *rq = cpu_rq(cpu);
> > +       unsigned long util;
> > +
> > +       if (tsk->sched_class != &fair_sched_class)
> > +               return val;
> > +       util = cpu_util_rt(rq) + cpu_util_cfs(cpu) + cpu_util_irq(rq) + cpu_util_dl(rq);
>
> This is not correct as irq is not on the same clock domain: look at
> effective_cpu_util()
>
> You don't care about idle time ?

OK, I will check.
Thanks.

> > +       return min(val, cpu_util_cfs(cpu) * val / util);
> > +}
> > +
> >  #endif /* CONFIG_SMP */
> >
> >  /**
> > --
> > 2.25.1
> >
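
For reference, the clock-domain handling Vincent points to lives in
effective_cpu_util(): IRQ utilization is tracked against a different clock,
so the rt/dl/cfs sum has to be scaled by the fraction of time left after IRQ
before the two can be added. Below is a rough sketch of that shape,
simplified from the ENERGY_UTIL path of the mainline helper; it is not the
thread's code, just an illustration of the accounting the review asks for:

	unsigned long max = arch_scale_cpu_capacity(cpu);
	unsigned long irq = cpu_util_irq(rq);
	unsigned long util;

	/* irq is tracked with a different clock; cap it first */
	if (unlikely(irq >= max))
		return max;		/* cpu fully consumed by irq */

	util = cpu_util_cfs(cpu) + cpu_util_rt(rq) + cpu_util_dl(rq);

	/* rt/dl/cfs only run in the time left after irq: util * (max - irq) / max */
	util = scale_irq_capacity(util, irq, max);
	util += irq;

	return min(max, util);

With the total clamped to the CPU capacity like this, the idle time the
review asks about is approximately max - util, so whether idle appears in
the denominator changes the computed cfs proportion on a lightly loaded CPU.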
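
For completeness, a hypothetical caller of the proposed helper might look
like the sketch below. The variable names are made up for illustration
(and a 64-bit unsigned long is assumed); per the comment in the patch, the
measured period must span several ticks, or be an aggregate over repeated
runs, for the PELT averages to be representative:

	u64 start_ns, total_ns, effective_ns;

	start_ns = ktime_get_ns();
	/* ... workload being timed, lasting at least several ticks ... */
	total_ns = ktime_get_ns() - start_ns;

	/* discount the time lost to rt/dl/irq on this task's cpu */
	effective_ns = cfs_prop_by_util(current, total_ns);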