On Mon, Apr 10, 2017 at 01:48:13PM -0400, Alan Stern wrote: > On Mon, 10 Apr 2017, Paul E. McKenney wrote: > > > On Mon, Apr 10, 2017 at 12:20:53PM -0400, Alan Stern wrote: > > > On Mon, 10 Apr 2017, Paul E. McKenney wrote: > > > > > > > > But I would like to get this matter settled first. Is the explicit > > > > > barrier truly necessary? > > > > > > > > If you are using wait_event()/wake_up() or friends, the explicit > > > > barrier -is- necessary. To see this, look at v4.10's wait_event(): > > > > > > > > #define wait_event(wq, condition) \ > > > > do { \ > > > > might_sleep(); \ > > > > if (condition) \ > > > > break; \ > > > > __wait_event(wq, condition); \ > > > > } while (0) > > > > > > > > As you can see, if the condition is set just before the wait_event() > > > > macro checks it, there is no ordering whatsoever. > > > > > > This is true, but it is not relevant to the question I was asking. > > > > Apologies! What I get for answering email too early on Monday, I guess... > > > > > > And if wake_up() > > > > finds nothing to wake up, there is no relevant ordering on that side, > > > > either. > > > > > > > > So you had better supply your own ordering, period, end of story. > > > > > > The question is: Exactly what ordering do I need to supply? The > > > ordering among my own variables is okay; I know how to deal with that. > > > But what about the ordering between my variables and current->state? > > > > The ordering with current->state is sadly not relevant because it is > > only touched if wake_up() actually wakes the process up. > > Well, it is _written_ only if wake_up() actually wakes the process up. > But it is _read_ in every case. For wake_up_process(), agreed. But for wake_up(), if the process doing the wait_event() saw the changed state on the very first check, the waking process won't have any way of gaining a reference to the "awakened" task, so cannot access its ->state. > > > For example, __wait_event() calls prepare_to_wait(), which calls > > > set_current_state(), which calls smp_store_mb(), thereby inserting a > > > full memory barrier between setting current->state and checking the > > > condition. But I didn't see any comparable barrier inserted by > > > wake_up(), between setting the condition and checking task->state. > > > > > > However, now that I look more closely, I do see that wakeup_process() > > > calls try_to_wake_up(), which begins with: > > > > > > /* > > > * If we are going to wake up a thread waiting for CONDITION we > > > * need to ensure that CONDITION=1 done by the caller can not be > > > * reordered with p->state check below. This pairs with mb() in > > > * set_current_state() the waiting thread does. > > > */ > > > smp_mb__before_spinlock(); > > > raw_spin_lock_irqsave(&p->pi_lock, flags); > > > if (!(p->state & state)) > > > > > > So it does insert a full barrier after all, and there is nothing to > > > worry about. > > > > Nice! > > It looks like the other wakeup pathways end up funnelling through > try_to_wake_up(), so this is true in general. Only for wake_up_process() and friends, not for wake_up() and friends. Again, although wake_up_process() unconditionally checks the awakened processm, wake_up() doesn't even have any way of knowing what process it woke up in the case where the "awakened" process didn't actually sleep. But even in the wake_up_process() case, don't we need the wait-side process to have appropriate barriers (or locks or whatever) manually inserted? > > Hmmm... > > > > Another valid (and I believe more common) idiom is this: > > > > spin_lock(&mylock); > > changes_that_must_be_visible_to_woken_thread(); > > WRITE_ONCE(need_wake_up, true); > > spin_unlock(&mylock); > > > > --- > > > > wait_event(wq, READ_ONCE(need_wake_up)); > > spin_lock(&mylock); > > access_variables_used_by_waking_thread(); > > spin_unlock(&mylock); > > > > In this case, the locks do all the required ordering. > > > > > This also means that the analysis provided by Thinh Nguyen in the > > > original patch description is wrong. > > > > And that the bug is elsewhere? > > Presumably. On the other hand, Thinh Nguyen claimed to have narrowed > the problem down to this particular mechanism. The driver in question > in drivers/usb/gadget/function/f_mass_storage.c. The waker routines > are bulk_out_complete()/wakeup_thread(), which do: > > // bulk_out_complete() > bh->state = BH_STATE_FULL; > > // wakeup_thread() > smp_wmb(); /* ensure the write of bh->state is complete */ > /* Tell the main thread that something has happened */ > common->thread_wakeup_needed = 1; > if (common->thread_task) > wake_up_process(common->thread_task); > > and the waiters are get_next_command()/sleep_thread(), which do: > > // get_next_command() > while (bh->state == BH_STATE_EMPTY) { > > // sleep_thread() > for (;;) { > if (can_freeze) > try_to_freeze(); > set_current_state(TASK_INTERRUPTIBLE); > if (signal_pending(current)) { > rc = -EINTR; > break; > } > if (common->thread_wakeup_needed) > break; > schedule(); > } > __set_current_state(TASK_RUNNING); > common->thread_wakeup_needed = 0; > smp_rmb(); /* ensure the latest bh->state is visible */ > > } > > and he said that the problem was caused by the waiter seeing > thread_wakeup_needed == 0, so the wakeup was getting lost. > > Hmmm, I suppose it's possible that the waker's thread_wakeup_needed = 1 > could race with the waiter's thread_wakeup_needed = 0. If there are > two waits in quick succession, the second could get lost. The pattern > would be: > > bh->state = BH_STATE_FULL; > smp_wmb(); > thread_wakeup_needed = 0; thread_wakeup_needed = 1; > smp_rmb(); > if (bh->state != BH_STATE_FULL) > sleep again... > > This is the so-called R pattern, and it also needs full memory barriers > on both sides. The barriers we have are not sufficient. (This is an > indication that the driver's design needs to be re-thought.) As it is, > the waiter's thread_wakeup_needed = 0 can overwrite the waker's > thread_wakeup_needed = 1 while the waiter's read of bh->state then > fails to see the waker's write. (This analysis is similar to but > different from the one in the patch description.) > > To fix this problem, both the smp_rmb() in sleep_thread() and the > smp_wmb() in wakeup_thread() should be changed to smp_mb(). Good catch! However, if this failure was seen on x86, there is something else going on as well. Thanx, Paul > Felipe, was this patch meant to solve the problem you encountered in > your "Memory barrier needed with wake_up_process()?" email thread last > fall? > > Alan Stern > -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html