Re: [PATCH 3/3] usb: gadget: f_mass_storage: Serialize wake and sleep execution

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 4/10/2017 12:31 PM, Paul E. McKenney wrote:
On Mon, Apr 10, 2017 at 03:00:34PM -0400, Alan Stern wrote:
On Mon, 10 Apr 2017, Paul E. McKenney wrote:

The ordering with current->state is sadly not relevant because it is
only touched if wake_up() actually wakes the process up.

Well, it is _written_ only if wake_up() actually wakes the process up.
But it is _read_ in every case.

For wake_up_process(), agreed.  But for wake_up(), if the process
doing the wait_event() saw the changed state on the very first check,
the waking process won't have any way of gaining a reference to the
"awakened" task, so cannot access its ->state.

True.  But that would be okay, since the waiter has definitely seen the
changed state.  I was concerned about the possibility that there was no
wakeup and the waiter did _not_ see the changed state.  That's how you
get tasks staying asleep indefinitely when they should be running.

It looks like the other wakeup pathways end up funnelling through
try_to_wake_up(), so this is true in general.

Only for wake_up_process() and friends, not for wake_up() and friends.
Again, although wake_up_process() unconditionally checks the awakened
processm, wake_up() doesn't even have any way of knowing what process
it woke up in the case where the "awakened" process didn't actually sleep.

Like the above, this would be okay.

But even in the wake_up_process() case, don't we need the wait-side
process to have appropriate barriers (or locks or whatever) manually
inserted?

Only for accesses among the driver's own variables.  There's no need to
order the local variables against current->state; as we have seen,
that's all handled for us.

Fair enough.  However, the kernel documentation needs to consider
the driver's own variables.

This also means that the analysis provided by Thinh Nguyen in the
original patch description is wrong.

And that the bug is elsewhere?

Presumably.  On the other hand, Thinh Nguyen claimed to have narrowed
the problem down to this particular mechanism.  The driver in question
in drivers/usb/gadget/function/f_mass_storage.c.  The waker routines
are bulk_out_complete()/wakeup_thread(), which do:

	// bulk_out_complete()
	bh->state = BH_STATE_FULL;

	// wakeup_thread()
	smp_wmb();	/* ensure the write of bh->state is complete */
	/* Tell the main thread that something has happened */
	common->thread_wakeup_needed = 1;
	if (common->thread_task)
		wake_up_process(common->thread_task);

and the waiters are get_next_command()/sleep_thread(), which do:

// get_next_command()
while (bh->state == BH_STATE_EMPTY) {

	// sleep_thread()
	for (;;) {
		if (can_freeze)
			try_to_freeze();
		set_current_state(TASK_INTERRUPTIBLE);
		if (signal_pending(current)) {
			rc = -EINTR;
			break;
		}
		if (common->thread_wakeup_needed)
			break;
		schedule();
	}
	__set_current_state(TASK_RUNNING);
	common->thread_wakeup_needed = 0;
	smp_rmb();	/* ensure the latest bh->state is visible */

}

and he said that the problem was caused by the waiter seeing
thread_wakeup_needed == 0, so the wakeup was getting lost.

Hmmm, I suppose it's possible that the waker's thread_wakeup_needed = 1
could race with the waiter's thread_wakeup_needed = 0.  If there are
two waits in quick succession, the second could get lost.  The pattern
would be:

					bh->state = BH_STATE_FULL;
					smp_wmb();
	thread_wakeup_needed = 0;	thread_wakeup_needed = 1;
	smp_rmb();
	if (bh->state != BH_STATE_FULL)
		sleep again...

This is the so-called R pattern, and it also needs full memory barriers
on both sides.  The barriers we have are not sufficient.  (This is an
indication that the driver's design needs to be re-thought.)  As it is,
the waiter's thread_wakeup_needed = 0 can overwrite the waker's
thread_wakeup_needed = 1 while the waiter's read of bh->state then
fails to see the waker's write.  (This analysis is similar to but
different from the one in the patch description.)

To fix this problem, both the smp_rmb() in sleep_thread() and the
smp_wmb() in wakeup_thread() should be changed to smp_mb().

Wouldn't it be sufficient to have smp_mb() just either in the sleep_thread() or the wakeup_thread() in this case?

With just 1 smp_mb(), either bh->state or thread_wakeup_needed is set in time to break the loop depending on when the wakeup_thread() enters.

Case 1: smp_mb() in sleep_thread()
					bh->state = BH_STATE_FULL;
					smp_wmb();
	thread_wakeup_needed = 0;	thread_wakeup_needed = 1;
-->	smp_mb();
	if (bh->state != BH_STATE_FULL)
	//breaks out of loop from bh->state check

Case 2: smp_mb() in wakeup_thread()

	thread_wakeup_needed = 0;	
	smp_rmb();
	if (bh->state != BH_STATE_FULL)
 					bh->state = BH_STATE_FULL;
--> 					smp_mb();
					thread_wakeup_needed = 1;
					wake_up_process()
	set_current_state(intr) //smp_wb
	if (thread_wakeup_need)
	//breaks out of loop from thread_wakeup_needed check


The issue won't be seen when I tested for both cases with smp_mb() in either the wakeup_thread() or the sleep_thread(). I notice that in wait_queue (DEFINED_WAIT_FUNC()), it places the smp_mb() in the wait function, so I follow its implementation.

However, my test maybe insufficient and I maybe wrong in my code review..


Good catch!

However, if this failure was seen on x86, there is something else going
on as well.

Why do you say that?  Isn't R-pattern reordering observable on x86?
It involves a write followed by a read, which is the sort of thing x86
is able to reorder.

You are right -- the smp_mb() is still needed on the write-then-read
side, and either a barrier(), an smp_wmb(), or an smp_store_release()
suffices for the pair-of-writes side.

Anyway, I don't know what type of system was used for testing.  The
patch description didn't say.

I used x86 machine to test this.


Alan Stern




Thanks a lot for reviewing the patch!

Thinh Nguyen
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux