Re: Gadgetfs's use of spinlocks

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 28 Apr 2009, David Brownell wrote:

> On Thursday 23 April 2009, Alan Stern wrote:
> > Dave:
> > 
> > A bug in gadgetfs has persisted for quite a while: nested spin_locks
> > when used with net2280.
> 
> Did you just happen to notice this when trying the latest
> PTP code?  :)

As a matter of fact, I first noticed it quite a long time ago, long
enough that I can't remember exactly when.  But recent bug reports and
in particular the PTP stuff jogged my memory.

> > This is because inode.c calls usb_ep_queue() 
> > for ep0 while holding the device lock, and ep0_complete() tries to
> > acquire the lock.  Since net2280 will call the completion routine
> > directly during submission if the request is small enough to fit
> > entirely in a FIFO, we end up in deadlock.  (Dummy_hcd may behave the
> > same way; I haven't checked.)
> 
> Yeah, locking there has always been pretty nasty.
> 
> There were a surprising number of code paths that
> could tromp on each other ... some of them may now
> have vanished, since the ability to implement a
> userspace gadget without ioctls has been removed.

The degree of complication was sufficiently high that I never did fully
grok that driver.  Which is why I'm asking for help now...

> Note that your subject is misleading.  There's
> only *one* spinlock ... :)

You're right.  The subject line should have read "Gadgetfs's use of 
spin_lock()".  :-)

> > The patch below attempts to fix the problem by dropping the lock around 
> > the calls to usb_ep_queue() for ep0, but I don't know whether doing 
> > this might lead to other problems.
> 
> The comment in the "struct ep_data" definition is that
> the spinlock must be held to access ep or req.
> 
> Now, digging back into the foggier parts of my memory,
> the really nasty cases there involved cleanup paths.
> 
> As in, close file descriptor ==> gadgetfs_unbind(),
> unplug cable from host ==> gadgetfs_disconnect(), and
> both of those could happen concurrently with respect
> to request completion or submission... yep, three
> concurrent nastinesses to defend against.  Luckily
> it was easy enough to test each singly, and along
> with any one of the others.

"Close file descriptor" is the only unusual event; unplug, submission,
and completion occur with every gadget driver.  Does that make it any
easier to think about?  Probably not...

> A better solution than that spinlock may be needed.
> Maybe a parallel state machine.

Or refactoring the ep0 completion handler, moving more of the work into 
the submitter thread.

> > I suppose the places inside  
> > gadgetfs_setup() are probably okay, but what about the call inside 
> > ep0_read()?  There's a possibility that the host might send a new 
> > SETUP packet while waiting for the userspace program to get around to 
> > reading the OUT data from the previous transfer.
> 
> Yeah, ep0 in particular has additional nasty cases that
> none of the other endpoints had.  Like those.
> 
>  
> > Can you look this over and figure out what more (if anything) needs to 
> > be done?
> 
> It'll hurt, but ... I'll try.

Sorry for dumping this in your lap...  At least it's a short patch!  :-)

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux