ons. 28. apr. 2021 kl. 17:36 skrev Vincent Guittot <vincent.guittot@xxxxxxxxxx>: > You can keep both fixes tags ACK > If the cfs_rq is already in the list list_add_leaf_cfs_rq() will exit > early but if it's not, we don't have to make sure that the whole > branch in the list Yeah, thats right. Calling list_add_leaf_cfs_rq once "too much" doesnt hurt after all. > In fact, we can break as soon as list_add_leaf_cfs_rq() and > cfs_rq_throttled() return true ACK, that makes sense. > When a cfs_rq is throttled, it is not accounted in its parent anymore > so we don't have to update and propagate the load down. Okay. Still need to wrap my head around this a bit more I guess. I have looked a bit around, and there is actually a similar issue as "this one" for the case when a throttled cgroup is "moved" via cpuset. It is however waaay harder to reproduce, but it is doable, and it _will_ happen in real life systems if the timing is "correct". I will dig deeper and finish the patch for that case some time next week (hopefully). I think that however deserve a separate patchset, so I will come back with that later. > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index 33b1ee31ae0f..18441ce7316c 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -11026,10 +11026,10 @@ static void propagate_entity_cfs_rq(struct sched_entity *se) > for_each_sched_entity(se) { > cfs_rq = cfs_rq_of(se); > > - if (cfs_rq_throttled(cfs_rq)) > - break; > + if (!cfs_rq_throttled(cfs_rq)) > + update_load_avg(cfs_rq, se, UPDATE_TG); > > - update_load_avg(cfs_rq, se, UPDATE_TG); > + list_add_leaf_cfs_rq(cfs_rq); > } > } Sent a v2 with something like this now; that exit if (list_add_leaf_cfs_rq(cfs_rq) && throttled). Since this loop start at the parent of the cfs_rq of the supplied se, I added a list_add_leaf_cfs_rq to the top in order to insert the leaf cfs_rq as well. Thanks Odin