2016-08-10 20:43 GMT+08:00 Frederic Weisbecker <fweisbec@xxxxxxxxx>:
> On Thu, Aug 04, 2016 at 05:51:20PM +0800, Wanpeng Li wrote:
>> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>
>> The dl task will be replenished after dl task timer fire and start a new
>> period. It will be enqueued and to re-evaluate its dependency on the tick
>> in order to restart it. However, if cpu is hot-unplug, irq_work_queue will
>> splash since the target cpu is offline.
>>
>> As a result:
>>
>> WARNING: CPU: 2 PID: 0 at kernel/irq_work.c:69 irq_work_queue_on+0xad/0xe0
>> Call Trace:
>>  dump_stack+0x99/0xd0
>>  __warn+0xd1/0xf0
>>  warn_slowpath_null+0x1d/0x20
>>  irq_work_queue_on+0xad/0xe0
>>  tick_nohz_full_kick_cpu+0x44/0x50
>>  tick_nohz_dep_set_cpu+0x74/0xb0
>>  enqueue_task_dl+0x226/0x480
>>  activate_task+0x5c/0xa0
>>  dl_task_timer+0x19b/0x2c0
>>  ? push_dl_task.part.31+0x190/0x190
>>
>> This can be triggered by hot-unplug the full dynticks cpu which dl task
>> is running on.
>>
>> Actually we don't need to restart the tick since the target cpu is offline
>> and nothing need scheduler tick. This patch fix it by not intend to re-evaluate
>> tick dependency if the cpu is offline.
>>
>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> Cc: Juri Lelli <juri.lelli@xxxxxxx>
>> Cc: Luca Abeni <luca.abeni@xxxxxxxx>
>> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>> ---
>>  kernel/sched/core.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 7f2cae4..43b494f 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -628,6 +628,9 @@ bool sched_can_stop_tick(struct rq *rq)
>>  {
>>         int fifo_nr_running;
>>
>> +       if (unlikely(!rq->online))
>> +               return true;
>> +
>
> I see, the CPU is offline but the tasks haven't been migrated yet.
> That said it seems that rollback is still possible at this stage.
>
> Somehow we may need to deal with it.

Thanks for your review, Frederic. :)

The rq lock is held to serialize CPU hotplug against the dl task enqueue
path (sched_can_stop_tick() is called in this path), so I think there is
no issue here.

Regards,
Wanpeng Li
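
P.S. To illustrate the serialization argument above, here is a minimal
userspace sketch. It is not kernel code: struct rq_model and every function
name in it are hypothetical stand-ins, a pthread mutex plays the role of
rq->lock, and a counter stands in for the kick done by
tick_nohz_full_kick_cpu(). It assumes, as argued above, that the online
flag is only changed with the rq lock held. It shows only that an enqueue
running after the offline transition sees the flag and skips the kick. It
does not model the rollback case raised in the review.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct rq. */
struct rq_model {
    pthread_mutex_t lock; /* plays the role of rq->lock */
    bool online;          /* plays the role of rq->online */
    int kicks;            /* counts simulated tick-restart kicks */
};

void rq_model_init(struct rq_model *rq)
{
    assert(rq != NULL);
    pthread_mutex_init(&rq->lock, NULL);
    rq->online = true;
    rq->kicks = 0;
}

/*
 * Mirrors the early return added by the patch: an offline runqueue
 * never needs the tick. In this model an online runqueue always
 * needs it, standing in for "a dl task was just enqueued".
 */
bool can_stop_tick(const struct rq_model *rq)
{
    if (!rq->online)
        return true;
    return false;
}

/* Simulated enqueue path: re-evaluates the tick dependency under the lock. */
void enqueue_model(struct rq_model *rq)
{
    pthread_mutex_lock(&rq->lock);
    if (!can_stop_tick(rq))
        rq->kicks++; /* stands in for kicking the target CPU */
    pthread_mutex_unlock(&rq->lock);
}

/* Simulated hot-unplug step: clears the online flag under the same lock. */
void set_offline_model(struct rq_model *rq)
{
    pthread_mutex_lock(&rq->lock);
    rq->online = false;
    pthread_mutex_unlock(&rq->lock);
}
```

Because both paths take the same lock, an enqueue either completes before
the flag is cleared (and kicks a CPU that is still online) or runs after it
(and skips the kick). In the model, calling enqueue_model() once before and
once after set_offline_model() leaves the kick counter at 1.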