On 2023/3/5 12:02, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
>     sched/fair: sanitize vruntime of entity being placed
>
> to the 4.14-stable tree which can be found at:
>     http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary
>
> The filename of the patch is:
>      sched-fair-sanitize-vruntime-of-entity-being-placed.patch
> and it can be found in the queue-4.14 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable@xxxxxxxxxxxxxxx> know about it.
>
>
>
> commit 38247e1de3305a6ef644404ac818bc6129440eae

Hi,

This patch has a significant impact on hackbench.throughput [1].
Please don't backport this patch.

[1] https://lore.kernel.org/lkml/202302211553.9738f304-yujie.liu@xxxxxxxxx/T/#u

Thanks.
Zhang Qiao.

> Author: Zhang Qiao <zhangqiao22@xxxxxxxxxx>
> Date:   Mon Jan 30 13:22:16 2023 +0100
>
>     sched/fair: sanitize vruntime of entity being placed
>
>     [ Upstream commit 829c1651e9c4a6f78398d3e67651cef9bb6b42cc ]
>
>     When a scheduling entity is placed onto cfs_rq, its vruntime is pulled
>     to the base level (around cfs_rq->min_vruntime), so that the entity
>     doesn't gain extra boost when placed backwards.
>
>     However, if the entity being placed wasn't executed for a long time, its
>     vruntime may get too far behind (e.g. while cfs_rq was executing a
>     low-weight hog), which can inverse the vruntime comparison due to s64
>     overflow.  This results in the entity being placed with its original
>     vruntime way forwards, so that it will effectively never get to the cpu.
>
>     To prevent that, ignore the vruntime of the entity being placed if it
>     didn't execute for much longer than the characteristic sheduler time
>     scale.
>
>     [rkagan: formatted, adjusted commit log, comments, cutoff value]
>     Signed-off-by: Zhang Qiao <zhangqiao22@xxxxxxxxxx>
>     Co-developed-by: Roman Kagan <rkagan@xxxxxxxxx>
>     Signed-off-by: Roman Kagan <rkagan@xxxxxxxxx>
>     Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
>     Link: https://lkml.kernel.org/r/20230130122216.3555094-1-rkagan@xxxxxxxxx
>     Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3ff60230710c9..afa21e43477fa 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3615,6 +3615,7 @@ static void
>  place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>  {
>  	u64 vruntime = cfs_rq->min_vruntime;
> +	u64 sleep_time;
>
>  	/*
>  	 * The 'current' period is already promised to the current tasks,
> @@ -3639,8 +3640,18 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
>  		vruntime -= thresh;
>  	}
>
> -	/* ensure we never gain time by being placed backwards. */
> -	se->vruntime = max_vruntime(se->vruntime, vruntime);
> +	/*
> +	 * Pull vruntime of the entity being placed to the base level of
> +	 * cfs_rq, to prevent boosting it if placed backwards.  If the entity
> +	 * slept for a long time, don't even try to compare its vruntime with
> +	 * the base as it may be too far off and the comparison may get
> +	 * inversed due to s64 overflow.
> +	 */
> +	sleep_time = rq_clock_task(rq_of(cfs_rq)) - se->exec_start;
> +	if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
> +		se->vruntime = vruntime;
> +	else
> +		se->vruntime = max_vruntime(se->vruntime, vruntime);
>  }
>
>  static void check_enqueue_throttle(struct cfs_rq *cfs_rq);
> .
>
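
For readers following the s64 overflow argument in the quoted commit log, here is a minimal userspace sketch of the comparison before and after the patch. It is not the kernel code: place_old() and place_new() are hypothetical helper names, the u64/s64 typedefs and NSEC_PER_SEC stand in for the kernel's own definitions, and the clock and exec_start values are passed in as plain arguments instead of being read from the runqueue. The signed cast assumes the usual two's-complement conversion.

```c
#include <stdint.h>

typedef uint64_t u64;
typedef int64_t s64;

#define NSEC_PER_SEC 1000000000LL

/*
 * Wrap-aware maximum of two vruntime values: the difference is taken
 * in u64 and then interpreted as s64. This gives the right answer only
 * while the two values are less than 2^63 apart.
 */
static u64 max_vruntime(u64 max_vruntime, u64 vruntime)
{
	s64 delta = (s64)(vruntime - max_vruntime);

	if (delta > 0)
		max_vruntime = vruntime;
	return max_vruntime;
}

/* Hypothetical helper: placement as it was before the patch. */
static u64 place_old(u64 se_vruntime, u64 base)
{
	return max_vruntime(se_vruntime, base);
}

/*
 * Hypothetical helper: placement after the patch. If the entity has
 * not run for more than 60 seconds, its old vruntime is ignored and
 * the base is used directly, so the comparison cannot be inverted.
 */
static u64 place_new(u64 se_vruntime, u64 base, u64 now, u64 exec_start)
{
	u64 sleep_time = now - exec_start;

	if ((s64)sleep_time > 60LL * NSEC_PER_SEC)
		return base;
	return max_vruntime(se_vruntime, base);
}
```

With a stale vruntime of 1000 and a base of 1000 + 2^63 + 5, place_old() returns the stale 1000: the signed difference is negative, so the entity looks as if it were far ahead of the base instead of far behind it. place_new() with a sleep time above 60 seconds returns the base for the same inputs. For short sleeps the two helpers behave identically.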