On Fri, 2 Sep 2016, Felipe Balbi wrote: > I'm thinking we might have a race WRT our use of memory barriers; could > it be? > > /me goes try > > Yeah, I've converted all smp_[wr]mb() to smp_mb() and the problem went > away (apparently). It was failing on first iteration of my test but it > has now been running for over 20 iterations without a problem. > > Now we just need to figure out which of the barriers is wrong. I just noticed that the kerneldoc for wake_up_process() says that the caller should assume a write memory barrier if and only if any tasks are woken up. So if the task is already running, there is no barrier. This means that in f_mass_storage's wakeup_thread(), there may be a race involving wake_up_process() reading the thread's state and the write to common->thread_wakeup_needed. wakeup_thread(): common->thread_wakeup_needed = 1; wake_up_process(common->thread_task); /* reads the thread's state */ sleep_thread(): set_current_state(TASK_INTERRUPTIBLE); if (common->thread_wakeup_needed) break; Now, set_current_state() implicitly has a memory barrier at the end. But since wake_up_process() doesn't always add a barrier the start, we may need to do this explicitly. Without that barrier, the CPU might execute the read in wake_up_process() before setting thread_wakeup_needed to 1. Then sleep_thread() would slip through the crack, and the thread wouldn't run. You can try adding smp_mb() in wakeup_thread(), just before the call to wake_up_process(). Alan Stern -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html