On Fri, 2 Sep 2016, Paul E. McKenney wrote:

> On Fri, Sep 02, 2016 at 02:10:13PM -0400, Alan Stern wrote:
> > Paul, Peter, and Ingo:
> >
> > This must have come up before, but I don't know what was decided.
> >
> > Isn't it often true that a memory barrier is needed before a call to
> > wake_up_process()?  A typical scenario might look like this:
> >
> > 	CPU 0
> > 	-----
> > 	for (;;) {
> > 		set_current_state(TASK_INTERRUPTIBLE);
> > 		if (signal_pending(current))
> > 			break;
> > 		if (wakeup_flag)
> > 			break;
> > 		schedule();
> > 	}
> > 	__set_current_state(TASK_RUNNING);
> > 	wakeup_flag = 0;
> >
> > 	CPU 1
> > 	-----
> > 	wakeup_flag = 1;
> > 	wake_up_process(my_task);
> >
> > The underlying pattern is:
> >
> > 	CPU 0				CPU 1
> > 	-----				-----
> > 	write current->state		write wakeup_flag
> > 	smp_mb();
> > 	read wakeup_flag		read my_task->state
> >
> > where set_current_state() does the write to current->state and
> > automatically adds the smp_mb(), and wake_up_process() reads
> > my_task->state to see whether the task needs to be woken up.
> >
> > The kerneldoc for wake_up_process() says that it has no implied
> > memory barrier if it doesn't actually wake anything up.  And even
> > when it does, the implied barrier is only smp_wmb(), not smp_mb().
> >
> > This is the so-called SB (Store Buffer) pattern, which is well known
> > to require a full smp_mb() on both sides.  Since wake_up_process()
> > doesn't include smp_mb(), isn't it correct that the caller must add
> > it explicitly?
> >
> > In other words, shouldn't the code for CPU 1 really be:
> >
> > 	wakeup_flag = 1;
> > 	smp_mb();
> > 	wake_up_process(my_task);
> >
> > If my reasoning is correct, then why doesn't wake_up_process()
> > include this memory barrier automatically, the way
> > set_current_state() does?  There could be an alternate version
> > (__wake_up_process()) which omits the barrier, just like
> > __set_current_state().
>
> A common case uses locking, in which case additional memory barriers
> inside of the wait/wakeup functions are not needed.  Any accesses made
> while holding the lock before invoking the wakeup function (e.g.,
> wake_up()) are guaranteed to be seen after acquiring that same lock
> following return from the wait function (e.g., wait_event()).  In this
> case, adding barriers to the wait and wakeup functions would just add
> overhead.
>
> But yes, this decision does mean that people using the wait/wakeup
> functions without locking need to be more careful.  Something like
> this:
>
> 	/* prior accesses. */
> 	smp_mb();
> 	wakeup_flag = 1;
> 	wake_up(...);
>
> And on the other task:
>
> 	wait_event(... wakeup_flag == 1 ...);
> 	smp_mb();
> 	/* The waker's prior accesses will be visible here. */
>
> Or am I missing your point?

I'm afraid so.  The code doesn't use wait_event(), in part because
there's no wait_queue (since only one task is involved).

But maybe there's another barrier which needs to be fixed.

Felipe, can you check whether received_cbw() is getting called in
get_next_command(), and if so, what value it returns?  Or is the
preceding sleep_thread() the one that never wakes up?

It could be that the smp_wmb() in wakeup_thread() needs to be
smp_mb(), because get_next_command() runs outside the protection of
the spinlock.

Alan Stern
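
P.S.  For anyone who wants to experiment with the SB pattern outside
the kernel, here is a minimal user-space sketch using C11 atomics,
with seq_cst fences standing in for smp_mb().  The variable names,
thread setup, and file name are made up for the demo; this is not the
driver's actual code.

	/* sb_demo.c -- build with:  cc -std=c11 -pthread sb_demo.c */
	#include <stdatomic.h>
	#include <pthread.h>
	#include <stdio.h>

	static atomic_int task_state;	/* stands in for current->state */
	static atomic_int wakeup_flag;
	static int r0, r1;		/* what each thread observed */

	static void *sleeper(void *arg)	/* "CPU 0" */
	{
		(void)arg;
		/* like set_current_state(): store, then full barrier */
		atomic_store_explicit(&task_state, 1, memory_order_relaxed);
		atomic_thread_fence(memory_order_seq_cst);	/* smp_mb() */
		r0 = atomic_load_explicit(&wakeup_flag, memory_order_relaxed);
		return NULL;
	}

	static void *waker(void *arg)	/* "CPU 1", with the proposed fix */
	{
		(void)arg;
		atomic_store_explicit(&wakeup_flag, 1, memory_order_relaxed);
		atomic_thread_fence(memory_order_seq_cst);	/* smp_mb() */
		r1 = atomic_load_explicit(&task_state, memory_order_relaxed);
		return NULL;
	}

	int main(void)
	{
		pthread_t a, b;

		pthread_create(&a, NULL, sleeper, NULL);
		pthread_create(&b, NULL, waker, NULL);
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		/* With both fences, r0 == 0 && r1 == 0 is forbidden:
		 * at least one side must see the other's store.  Drop
		 * either fence and that outcome becomes possible on
		 * weakly ordered hardware -- the user-space analogue
		 * of a missed wakeup. */
		printf("r0=%d r1=%d\n", r0, r1);
		return 0;
	}

A single run of course proves nothing; the forbidden outcome only
shows up statistically, so a real test would run the two threads in a
loop, litmus-test style, and count how often each result appears.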