Hi Vincent,

On 2018-04-26 17:27:24 +0200, Vincent Guittot wrote:
> Hi Niklas,
>
> >> Thanks for the trace, I have been able to catch a problem with it.
> >> Could you test the patch below to confirm that the problem is solved?
> >> The patch applies on top of
> >> c18bb396d3d261eb ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
> >
> > I can confirm that with the patch below I can no longer reproduce the
> > problem. Thanks!
>
> Thanks for testing.
> Do you mind if I add
> Tested-by: Niklas Söderlund <niklas.soderlund@xxxxxxxxxxxx>

Please do.

>
> Peter, Ingo,
> Do you want me to re-send the patch with all the tags, or will you take
> this version?
>
> Regards,
> Vincent
>
> >
> >>
> >> From: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> >> Date: Thu, 26 Apr 2018 12:19:32 +0200
> >> Subject: [PATCH] sched/fair: fix the update of blocked load when newly idle
> >> MIME-Version: 1.0
> >> Content-Type: text/plain; charset=UTF-8
> >> Content-Transfer-Encoding: 8bit
> >>
> >> With commit 31e77c93e432 ("sched/fair: Update blocked load when newly idle"),
> >> we release the rq->lock when updating the blocked load of idle CPUs. This
> >> opens a time window during which another CPU can add a task to this CPU's
> >> cfs_rq. The check in idle_balance() for newly added tasks is not in the
> >> common path. Move the out label so that this check is included.
> >>
> >> Fixes: 31e77c93e432 ("sched/fair: Update blocked load when newly idle")
> >> Reported-by: Heiner Kallweit <hkallweit1@xxxxxxxxx>
> >> Reported-by: Niklas Söderlund <niklas.soderlund@xxxxxxxxxxxx>
> >> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> >> ---
> >>  kernel/sched/fair.c | 2 +-
> >>  1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> >> index 0951d1c..15a9f5e 100644
> >> --- a/kernel/sched/fair.c
> >> +++ b/kernel/sched/fair.c
> >> @@ -9847,6 +9847,7 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
> >>  	if (curr_cost > this_rq->max_idle_balance_cost)
> >>  		this_rq->max_idle_balance_cost = curr_cost;
> >>
> >> +out:
> >>  	/*
> >>  	 * While browsing the domains, we released the rq lock, a task could
> >>  	 * have been enqueued in the meantime. Since we're not going idle,
> >> @@ -9855,7 +9856,6 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
> >>  	if (this_rq->cfs.h_nr_running && !pulled_task)
> >>  		pulled_task = 1;
> >>
> >> -out:
> >>  	/* Move the next balance forward */
> >>  	if (time_after(this_rq->next_balance, next_balance))
> >>  		this_rq->next_balance = next_balance;
> >> --
> >> 2.7.4
> >>
> >>
> >> [snip]
> >>
> >
> > --
> > Regards,
> > Niklas Söderlund

--
Regards,
Niklas Söderlund
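
To make the effect of the label move above easier to see, here is a minimal,
standalone C sketch of the control flow. It is not kernel code: the
has_new_task flag and the take_early_exit parameter are illustrative
stand-ins for "a task was enqueued on this CPU's cfs_rq while rq->lock was
released" and for the early-exit paths in idle_balance() (such as the one
introduced by the newly-idle blocked-load update); the real function in
kernel/sched/fair.c operates on struct rq and sched domains and is far more
involved.

/*
 * Standalone sketch, not kernel code: it only models the placement of the
 * "out" label relative to the new-task check in idle_balance().
 */
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for "a task was enqueued while rq->lock was released". */
static bool has_new_task;

/* Before the patch: early exits jump past the new-task check. */
static int idle_balance_old(bool take_early_exit)
{
	int pulled_task = 0;

	if (take_early_exit)
		goto out;	/* e.g. the blocked-load update path */

	/* ... load balancing across sched domains would happen here ... */

	if (has_new_task && !pulled_task)
		pulled_task = 1;	/* never reached from the early exit */
out:
	return pulled_task;
}

/* After the patch: the label sits above the check, so every path runs it. */
static int idle_balance_new(bool take_early_exit)
{
	int pulled_task = 0;

	if (take_early_exit)
		goto out;

	/* ... load balancing across sched domains would happen here ... */

out:
	if (has_new_task && !pulled_task)
		pulled_task = 1;	/* reached on every exit path */

	return pulled_task;
}

int main(void)
{
	/* A task shows up during the window and we take the early exit. */
	has_new_task = true;
	printf("old: pulled_task=%d (task missed, CPU would go idle)\n",
	       idle_balance_old(true));
	printf("new: pulled_task=%d (task noticed, CPU stays busy)\n",
	       idle_balance_new(true));
	return 0;
}

Run as written, the pre-patch variant returns 0 on the early-exit path even
though a task has arrived, so the CPU would go idle with a runnable task
queued; with the patched label placement every exit path performs the check
and reports the task.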