On Wed, May 27, 2009 at 09:57:19AM +0800, Lai Jiangshan wrote:
> Paul E. McKenney wrote:
> >
> > I am concerned about the following sequence of events:
> >
> > o	synchronize_sched_expedited() disables preemption, thus blocking
> >	offlining operations.
> >
> > o	CPU 1 starts offlining CPU 0.  It acquires the CPU-hotplug lock,
> >	and proceeds, and is now waiting for preemption to be enabled.
> >
> > o	synchronize_sched_expedited() disables preemption, sees
> >	that CPU 0 is online, so initializes and queues a request,
> >	does a wake-up-process(), and finally does a preempt_enable().
> >
> > o	CPU 0 is currently running a high-priority real-time process,
> >	so the wakeup does not immediately happen.
> >
> > o	The offlining process completes, including the kthread_stop()
> >	to the migration task.
> >
> > o	The migration task wakes up, sees kthread_should_stop(),
> >	and so exits without checking its queue.
> >
> > o	synchronize_sched_expedited() waits forever for CPU 0 to respond.
> >
> > I suppose that one way to handle this would be to check for the CPU
> > going offline before doing the wait_for_completion(), but I am concerned
> > about races affecting this check as well.
> >
> > Or is there something in the CPU-offline process that makes the above
> > sequence of events impossible?
> >
> > 							Thanx, Paul
> >
> >
>
> I realized this, I wrote this:
> >
> > The coupling of synchronize_sched_expedited() and migration_req
> > is largely increased:
> >
> > 1) The offline cpu's per_cpu(rcu_migration_req, cpu) is handled.
> >    See migration_call::CPU_DEAD
>
> synchronize_sched_expedited() will not wait for CPU#0, because
> migration_call()::case CPU_DEAD wakes up the requestors.
>
> migration_call()
> {
> 	...
> 	case CPU_DEAD:
> 	case CPU_DEAD_FROZEN:
> 		...
> 		/*
> 		 * No need to migrate the tasks: it was best-effort if
> 		 * they didn't take sched_hotcpu_mutex. Just wake up
> 		 * the requestors.
> 		 */
> 		spin_lock_irq(&rq->lock);
> 		while (!list_empty(&rq->migration_queue)) {
> 			struct migration_req *req;
>
> 			req = list_entry(rq->migration_queue.next,
> 					 struct migration_req, list);
> 			list_del_init(&req->list);
> 			spin_unlock_irq(&rq->lock);
> 			complete(&req->done);
> 			spin_lock_irq(&rq->lock);
> 		}
> 		spin_unlock_irq(&rq->lock);
> 		...
> 	...
> }
>
> My approach depend on the requestors are waked up at any case.
> migration_call() does it for us but the coupling is largely
> increased.

OK, good point!  I do need to think about this.

In the meantime, where do you see a need to run
synchronize_sched_expedited() from within a hotplug CPU notifier?

							Thanx, Paul
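
For readers following the thread, the argument above can be modeled in
plain userspace C.  This is only a rough sketch and not kernel code: the
names model_req, model_rq, model_queue_req(), model_thread_stop() and
model_cpu_dead() are hypothetical, a bool stands in for the completion,
and rq->lock and all real concurrency are left out.  It shows only that a
request left behind when the migration thread exits is still completed
once the CPU_DEAD-style drain runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct migration_req; "done" models the completion. */
struct model_req {
	struct model_req *next;
	bool done;
};

/* Hypothetical stand-in for the runqueue's migration_queue (locking omitted). */
struct model_rq {
	struct model_req *head;
};

/* Requestor side: queue a request for the target CPU. */
static void model_queue_req(struct model_rq *rq, struct model_req *req)
{
	assert(rq != NULL && req != NULL);
	req->done = false;
	req->next = rq->head;
	rq->head = req;
}

/*
 * Models the migration thread seeing kthread_should_stop() and exiting
 * without looking at its queue: pending requests are left untouched.
 */
static void model_thread_stop(struct model_rq *rq)
{
	(void)rq;
}

/*
 * Models the CPU_DEAD drain loop: unlink every pending request and mark
 * it complete.  Returns the number of requestors "woken".
 */
static int model_cpu_dead(struct model_rq *rq)
{
	int woken = 0;

	assert(rq != NULL);
	while (rq->head != NULL) {
		struct model_req *req = rq->head;

		rq->head = req->next;	/* plays the role of list_del_init() */
		req->next = NULL;
		req->done = true;	/* plays the role of complete(&req->done) */
		woken++;
	}
	return woken;
}
```

In this model, a requestor that queued its request before the thread
stopped sees done == false after model_thread_stop() and done == true
after model_cpu_dead(), which is the property the approach relies on.
Whether the real wait can still race with the drain is exactly the open
question in the thread, and this sketch does not answer it.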