On Fri, Sep 02, 2016 at 04:29:19PM -0400, Alan Stern wrote:
> On Fri, 2 Sep 2016, Paul E. McKenney wrote:
>
> > On Fri, Sep 02, 2016 at 02:10:13PM -0400, Alan Stern wrote:
> > > Paul, Peter, and Ingo:
> > >
> > > This must have come up before, but I don't know what was decided.
> > >
> > > Isn't it often true that a memory barrier is needed before a call to
> > > wake_up_process()? A typical scenario might look like this:
> > >
> > >     CPU 0
> > >     -----
> > >     for (;;) {
> > >         set_current_state(TASK_INTERRUPTIBLE);
> > >         if (signal_pending(current))
> > >             break;
> > >         if (wakeup_flag)
> > >             break;
> > >         schedule();
> > >     }
> > >     __set_current_state(TASK_RUNNING);
> > >     wakeup_flag = 0;
> > >
> > >
> > >     CPU 1
> > >     -----
> > >     wakeup_flag = 1;
> > >     wake_up_process(my_task);
> > >
> > > The underlying pattern is:
> > >
> > >     CPU 0                       CPU 1
> > >     -----                       -----
> > >     write current->state        write wakeup_flag
> > >     smp_mb();
> > >     read wakeup_flag            read my_task->state
> > >
> > > where set_current_state() does the write to current->state and
> > > automatically adds the smp_mb(), and wake_up_process() reads
> > > my_task->state to see whether the task needs to be woken up.
> > >
> > > The kerneldoc for wake_up_process() says that it has no implied memory
> > > barrier if it doesn't actually wake anything up. And even when it
> > > does, the implied barrier is only smp_wmb, not smp_mb.
> > >
> > > This is the so-called SB (Store Buffer) pattern, which is well known to
> > > require a full smp_mb on both sides. Since wake_up_process() doesn't
> > > include smp_mb(), isn't it correct that the caller must add it
> > > explicitly?
> > >
> > > In other words, shouldn't the code for CPU 1 really be:
> > >
> > >     wakeup_flag = 1;
> > >     smp_mb();
> > >     wake_up_process(task);
> > >
> > > If my reasoning is correct, then why doesn't wake_up_process() include
> > > this memory barrier automatically, the way set_current_state() does?
> > > There could be an alternate version (__wake_up_process()) which omits
> > > the barrier, just like __set_current_state().
> >
> > A common case uses locking, in which case additional memory barriers
> > inside of the wait/wakeup functions are not needed. Any accesses made
> > while holding the lock before invoking the wakeup function (e.g.,
> > wake_up()) are guaranteed to be seen after acquiring that same
> > lock following return from the wait function (e.g., wait_event()).
> > In this case, adding barriers to the wait and wakeup functions would
> > just add overhead.
> >
> > But yes, this decision does mean that people using the wait/wakeup
> > functions without locking need to be more careful. Something like
> > this:
> >
> >     /* prior accesses. */
> >     smp_mb();
> >     wakeup_flag = 1;
> >     wake_up(...);
> >
> > And on the other task:
> >
> >     wait_event(... wakeup_flag == 1 ...);
> >     smp_mb();
> >     /* The waker's prior accesses will be visible here. */
> >
> > Or am I missing your point?
>
> I'm afraid so. The code doesn't use wait_event(), in part because
> there's no wait_queue (since only one task is involved).

Ah, got it. The required pattern should be very similar, however.

> But maybe there's another barrier which needs to be fixed. Felipe, can
> you check to see if received_cbw() is getting called in
> get_next_command(), and if so, what value it returns? Or is the
> preceding sleep_thread() the one that never wakes up?
>
> It could be that the smp_wmb() in wakeup_thread() needs to be smp_mb().
> The reason being that get_next_command() runs outside the protection of
> the spinlock.

This sounds very likely to me.

							Thanx, Paul
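
For readers who want to poke at the SB pattern from the thread outside the kernel, here is a rough user-space sketch. It is only an analogue, under these assumptions: C11 seq_cst fences stand in for smp_mb(), and the names `state`, `wakeup_flag`, `sleeper()`, `waker()` and `run_sb_trials()` are hypothetical stand-ins invented for this illustration, not kernel or gadget-driver code. It does not sleep or wake anything; it only checks whether both sides can miss each other's store.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for current->state and the driver's flag. */
static atomic_int state, wakeup_flag;
static atomic_int ready;          /* crude start line so the threads overlap */
static int r_sleeper, r_waker;    /* what each side observed */

static void wait_for_both(void)
{
	atomic_fetch_add(&ready, 1);
	while (atomic_load(&ready) < 2)
		;                 /* spin until the other thread has arrived */
}

/* "CPU 0": write state, full barrier, read wakeup_flag. */
static void *sleeper(void *unused)
{
	(void)unused;
	wait_for_both();
	atomic_store_explicit(&state, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);   /* assumed smp_mb() analogue */
	r_sleeper = atomic_load_explicit(&wakeup_flag, memory_order_relaxed);
	return NULL;
}

/* "CPU 1": write wakeup_flag, full barrier, read state. */
static void *waker(void *unused)
{
	(void)unused;
	wait_for_both();
	atomic_store_explicit(&wakeup_flag, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);   /* the barrier under discussion */
	r_waker = atomic_load_explicit(&state, memory_order_relaxed);
	return NULL;
}

/* Returns how many trials ended with BOTH loads seeing 0 (a "lost wakeup"). */
int run_sb_trials(int trials)
{
	int lost = 0;

	for (int i = 0; i < trials; i++) {
		pthread_t a, b;

		atomic_store(&state, 0);
		atomic_store(&wakeup_flag, 0);
		atomic_store(&ready, 0);
		pthread_create(&a, NULL, sleeper, NULL);
		pthread_create(&b, NULL, waker, NULL);
		pthread_join(a, NULL);
		pthread_join(b, NULL);
		if (r_sleeper == 0 && r_waker == 0)
			lost++;
	}
	return lost;
}
```

With a full fence on both sides, run_sb_trials() should always return 0, because C11 forbids both loads missing both stores when the two seq_cst fences are present. Dropping the fence in waker() (the analogue of calling wake_up_process() with no smp_mb() before it) makes the both-zero outcome legal, and it can show up on hardware with store buffers, x86 included. Build with something like `cc -std=c11 -pthread`.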