Re: [PATCH] usb-serial: fix lifetime and locking problems

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 14 Apr 2009 15:44:05 -0400 (EDT)
Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, 14 Apr 2009, Alan Cox wrote:
> 
> > (Please direct tty layer patches to me as well as Greg as I'm the one
> > currently trying to beat the tty layer into submission)
> 
> Sorry; I didn't realize this would interact quite so much with the tty 
> layer.

No problem, I doubt you could have deduced I was rewriting that code
unless you followed the serial lists religiously and are telepathic.

> > This entire code area has been significantly revamped in the devel code
> > to use the core tty_port_* helper functions.
> 
> Are there a lot of pending changes to usb-serial.c that aren't already 
> in Greg's tree?

For the open/close/hangup path yes. However they are not in a state that
will be ready for this release so your patches are very good news indeed
and unlike the devel tree are backportable.

> > I'll go over this patch in more detail tomorrow however.
> > 
> > > -	usb_serial_put(port->serial);
> > > +	put_device(&port->dev);
> > > +
> > > +	/* Mustn't dereference port any more */
> > > +	if (count == 0) {
> > 
> > So why is the put_device just before this safe - it dereferences port ?
> 
> The comment means that as a result of the put_device, port is now a 
> stale pointer.  Should I rephrase it (or omit it entirely)?

Oh so put_device potentially dropped the last reference ? It might be
clearer to say so.

> I'm very confused.  By "the tty port kref" do you mean 
> port->dev.kobject.kref, or do you mean something else?
> 
> By "the USB serial device level destructor" do you mean the
> destroy_serial() routine?  (Just want to make sure I understand
> correctly.)
> 
> Is "a kref to the tty" different from "the tty port kref"?

There are two objects with differing lifetimes.

struct tty_struct exists for the lifetime of a use of the port (ie while
it is open). Multiple opens use the same tty_struct bind to the same
struct tty.

struct tty_port is embedded in struct usb_port and exists for the
lifetime of the hardware (until it is unplugged, the driver unloaded
etc). The direction to the tty layer is heading in is to have a tty_port
object representing every port of all types. Historically each hardware
or type of hardware had a unique object for the hardware side which
caused massive code copying and makes it hard to change stuff.

> In fact, I don't see how this cancel_work_sync() call can be an issue
> at all.  Is there anything wrong with using it in place of
> flush_scheduled_work()?

For the current fix nothing at all. I was thinking more in the longer
term.

> > The easiest way to handle this is to kill the entire disc_mutex. It
> > serves no real purpose except to serialize a one shot event and report
> > it. You can do that with test_and_set_bit()
> > 
> > In usb_serial_disconnect()
> > 
> > 	if(test_and_set_bit(...))
> > 		return;
> > 
> > and test the bit in the disconnect path
> 
> I don't get it.  What purpose does test_and_set_bit() serve if it 
> occurs in only one place?

We can get disconnects from multiple sources in parallel as far as I can
tell (think about rmmod and unplug at the same time). If those are
serialized then it's even simpler.


> >            usb_serial_disconnect
> >                   t&sb blocks recursion
> >                         calls hangup
> >                              kills hardware in the recursion blocked path
> 
> How does hangup go about killing the hardware?  Does it somehow arrange 
> for usb_serial_disconnect() to be called again?

The flow with the correct POSIX logic is that on the last close() the
hardware and tty are disassociated (I guess killing the hardware is
misleading as a term when USB lets you kill it more drastically as in
take it offline/unplug it). The same occurs earlier if the device gets a
hang up event. After that any existing access will fail, but new access
may succeed in re-opening the port.

> The purpose of disc_mutex is to prevent the driver from doing I/O to a
> device after they have been unbound.  (It's important to remember that
> a driver can be unbound from a device without the device being
> unplugged.)

> Without the mutex, there's a race between usb_serial_disconnect() and
> the subdriver's open method.  Does tty_hangup() resolve this race 
> somehow?

For the unbind case possibly not. If the unbind calls
usb_serial_disconnect then it should do so as after the tty_hangup
completes. However tty_hangup is itself asynchronous and there can be
hanging tty references. It is thus possible that the line discipline can
call back into the USB driver write method (and one or two other
functions) after usb_serial_disconnect completes, on both the old and
updated code. This in fact happens at the moment and there are a few
traces in kerneloops showing the race is actually hit.

This is actually a general problem with the tty layer and hotplug devices
that I am still scratching my head over somewhat. disc_mutex is
insufficient because of scenarios like

	CPU 0				CPU 1

	disconnect begins		character arrives (takes kref)
        mutex taken and dropped

	kref keeps tty open
					character is echoed
					write method of device is called
					kref dropped
					object actually freed

Also as tty_hangup is asynchronous you get the simpler case of

	disconnect begins
	mutex taken
	tty_hangup queued
	mutex_dropped
	disconnect returns
	Line discipline writes characters to the port
	tty_hangup completes


The tty hangup has to be asynchronous as it takes all sorts of core file
system locks and walks file handle lists changing the operations for the
handle. I far this is going to force some problematic games with
refcounting as in some cases the tty needs to anchor the port in memory
and in some cases the port needs to anchor the tty in memory.


Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux