Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 10 Apr 2017, Felipe Balbi wrote:

> Hi,
> 
> Thinh Nguyen <Thinh.Nguyen@xxxxxxxxxxxx> writes:
> > f_mass_storage has a memorry barrier issue with the sleep and wake
> > functions that can cause a deadlock. This results in intermittent hangs
> > during MSC file transfer. The host will reset the device after receiving
> > no response to resume the transfer. This issue is seen when dwc3 is
> > processing 2 transfer-in-progress events at the same time, invoking
> > completion handlers for CSW and CBW. Also this issue occurs depending on
> > the system timing and latency.
> >
> > To increase the chance to hit this issue, you can force dwc3 driver to
> > wait and process those 2 events at once by adding a small delay (~100us)
> > in dwc3_check_event_buf() whenever the request is for CSW and read the
> > event count again. Avoid debugging with printk and ftrace as extra
> > delays and memory barrier will mask this issue.

> >
> > Scenario which can lead to failure:
> > -----------------------------------
> > 1) The main thread sleeps and waits for the next command.
> > 2) bulk_in_complete() wakes up main thread for CSW.
> > 3) bulk_out_complete() tries to wake up the running main thread for CBW.
> > 4) thread_wakeup_needed is not loaded with correct value in sleep_thread().
> > 5) Main thread goes to sleep.
> >
> > See more detail below:
> >
> > Note the 2 critical variables.
> >  * common->thread_wakeup_needed
> >  * bh->state
> >
> > Eval'd  | CPU0 (main thread)			   | CPU1
> > --------+------------------------------------------+----------------------
> >         | get_next_command()			   |
> >         | ...					   |
> > while(1)| while (bh->state != BUF_STATE_FULL) {	   |
> >         |  for (;;) {				   |
> >         |    set_current_state(TASK_INTERRUPTIBLE);|
> > if(0)   |    if (thread_wakeup_needed)		   |
> >         |    schedule()				   |
> > 	|					   | bulk_in_complete() {
> > 	|					   | smp_wmb();
> > 	|					   | bh->state = BUF_STATE_EMPTY
> > 	|					   | smp_wmb();
> > 	|					   | thread_wakeup_needed = 1
> > 	|					   | wake_up_process()
> > 	|					   | }
> > 	|  /* 2nd for loop iteration */		   |
> >         |  for (;;) {				   |
> >         |    set_current_state(TASK_INTERRUPTIBLE);|
> > if(1)   |    if (thread_wakeup_needed)		   |
> >         |      break				   |
> >         |  }					   |
> > 	|					   | bulk_out_complete() {
> > 	|					   | smp_wmb();
> > 	|  __set_current_state(TASK_RUNNING);	   |
> > 	|  thread_wakeup_needed = 0;		   |
> > 	|  smp_rmb();				   |
> > 	|					   |
> > 	|  /* 2nd while loop iteration */	   |
> > while(1)| while (bh->state != BUF_STATE_FULL) {	   |
> > 	|					   | bh->state = BUF_STATE_FULL;
> > 	|					   | smp_wmb();
> > 	|					-->| thread_wakeup_needed = 1
> > 	|					   | wake_up_process()
> > 	|					   | }
> >         |  for (;;) {				   |
> >         |    set_current_state(TASK_INTERRUPTIBLE);|
> > if(0)   |    if (thread_wakeup_needed)		   |<--
> >         |    schedule()				   |
> >         |					   |
> > 	|	<THREAD GOES TO SLEEP AND WAITS>   |
> >
> > thread_wakeup_needed is not guaranteed to be loaded before checking it.
> > Note that this happens when wake_up_process() is called on a running
> > process.
> >
> > This patch fixes this issue by explicitly use smp_mb() after clearing of
> > thread_wakeup_needed in the sleep function. It serializes the order in
> > which thread_wakeup_needed is loaded and stored, thus prevent loading
> > the wrong value when checking it in the sleeper.
> >
> > However, a better solution in the future would be to use wait_queue
> > method that takes care of managing memory barrier between waker and
> > waiter.
> >
> > See DEFINE_WAIT_FUNC comment in kernel scheduling wait.c as this
> > solution is similar to its implementation.
> >
> > Signed-off-by: Thinh Nguyen <thinhn@xxxxxxxxxxxx>
> > ---
> >  drivers/usb/gadget/function/f_mass_storage.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/usb/gadget/function/f_mass_storage.c b/drivers/usb/gadget/function/f_mass_storage.c
> > index 4c8aacc232c0..f40433eb8259 100644
> > --- a/drivers/usb/gadget/function/f_mass_storage.c
> > +++ b/drivers/usb/gadget/function/f_mass_storage.c
> > @@ -627,7 +627,7 @@ static int sleep_thread(struct fsg_common *common, bool can_freeze)
> >  	}
> >  	__set_current_state(TASK_RUNNING);
> >  	common->thread_wakeup_needed = 0;
> > -	smp_rmb();	/* ensure the latest bh->state is visible */
> > +	smp_mb();	/* ensure the latest bh->state is visible */
> >  	return rc;
> >  }
> >  
> > -- 
> > 2.11.0
> 
> Alan, do you have any comments here? Do we really need a full barrier in
> this situation?

Peter Z has pointed out in the past that the locking and barrier
mechanisms in f_mass_storage are not well thought out.  He also
suggested using a wait_queue rather than doing some home-made
sleep/wake calculations.  I agree that this would simplify the code.

However, in this case I do not believe the bug described above will be 
fixed by changing the smp_rmb() to smp_mb() or by going to a 
wait_queue.  Here's my reasoning; maybe I'm wrong...

The important part of this race can be written schematically like so:

	CPU 0				CPU 1

A:	thread_wakeup_needed = 0;
	smp_rmb();
	...
B:					thread_wakeup_needed = 1;
C:					wake_up_process();

D:	set_current_state(TASK_INTERRUPTIBLE);
E:	if (thread_wakeup_needed)
		break;
	schedule();

Now, set_current_state() uses smp_store_mb(), so D is equivalent to

D:	current->state = TASK_INTERRUPTIBLE;
	smp_mb();

Furthermore, wake_up_process(task) must read task->state in order to
see whether task needs to be woken up.  HOWEVER, the kerneldoc for
wake_up_process() says:

 * It may be assumed that this function implies a write memory barrier before
 * changing the task state if and only if any tasks are woken up.

(Side point: This says _write_ memory barrier, not _full_ memory
barrier, which is what is needed here.)

So if CPU 0 was not yet in TASK_INTERRUPTIBLE when CPU 1 ran
wake_up_process(), then CPU 1 would not execute any barriers and we
would have:

	CPU 0				CPU 1
	D: Write task->state		B: Write thread_wakeup_needed
	   smp_mb()			C: Read task->state
	E: Read thread_wakeup_needed

with no memory barrier on CPU 1.  In memory model circles this is known
as the SB (Store Buffering) pattern, and without a full barrier on both
sides it is entirely possible for both reads to miss seeing the other
CPU's write.  That's what causes CPU 0 to call schedule() with nobody
to wake it up again: CPU 0 sees that thread_wakeup_needed is still 0, 
and CPU 1 sees that task->state is still TASK_RUNNING.

Please note that this analysis remains unchanged if the smp_rmb() after 
A is changed to smp_mb().  The problem is not related to the 
assignment at A; it is related to the assignment at B.  We need to have 
a memory barrier between B and C regardless of whether the task was 
woken up.

In other words, I believe the correct fix is to add smp_mb() in 
f_mass_storage's wakeup_thread() routine, just after 
thread_wakeup_needed is set to 1.

Before today, I had believed that wake_up_process() implicitly
performed this memory barrier in all cases.  But the kerneldoc says it
doesn't, and unless Peter tells us that the kerneldoc is wrong, I see
no alternative to adding the memory barrier explicitly.

Incidentally, going to a wait_queue implementation wouldn't help.  With 
wait_queues, the pattern for CPU 1 would be:

	thread_wakeup_needed = 1;
	wake_up(&wq);

But wake_up() is implemented as a call to __wake_up(), and the 
kerneldoc for __wake_up() says:

 * It may be assumed that this function implies a write memory barrier before
 * changing the task state if and only if any tasks are woken up.

Just as in wake_up_process() -- and again, what we need is an 
unconditional full barrier, not a conditional write barrier.  See also 
the kerneldoc for waitqueue_active() in include/linux/wait.h; it states 
that an explicit smp_mb() is required on the waker.

It's hard to believe this sort of problem hasn't bitten other people in 
the past.  Is there a better solution than to add a memory barrier 
every place where wake_up() (or one of its siblings) is called?

Alan Stern

P.S.: This kind of solution is okay for right away and something to put 
in -stable.  For the longer term, however, it has hardened my resolve 
to completely rewrite the locking and barriers in f_mass_storage.

But I would like to get this matter settled first.  Is the explicit 
barrier truly necessary?

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux