Re: [PATCH] usb-serial: fix lifetime and locking problems

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 14 Apr 2009, Alan Cox wrote:

> > Sorry; I didn't realize this would interact quite so much with the tty 
> > layer.
> 
> No problem, I doubt you could have deduced I was rewriting that code
> unless you followed the serial lists religiously and are telepathic.

Well, I'm not telepathic with anyone more than a few feet away.  :-)  
(Which reminds me of an old joke of Isaac Asimov's: He said that he had
a near-photographic memory -- he could remember anything that was near
a photograph.)


> There are two objects with differing lifetimes.
> 
> struct tty_struct exists for the lifetime of a use of the port (ie while
> it is open). Multiple opens use the same tty_struct bind to the same
> struct tty.
> 
> struct tty_port is embedded in struct usb_port and exists for the
> lifetime of the hardware (until it is unplugged, the driver unloaded
> etc). The direction to the tty layer is heading in is to have a tty_port
> object representing every port of all types. Historically each hardware
> or type of hardware had a unique object for the hardware side which
> caused massive code copying and makes it hard to change stuff.

Hmm.  Why are there two separate objects?  Is this because multiple
tty_struct's can refer to the same tty_port concurrently?  Or is it a 
lifetime issue -- some of the information in tty_port is needed even 
when there are no associated tty_structs?


> We can get disconnects from multiple sources in parallel as far as I can
> tell (think about rmmod and unplug at the same time). If those are
> serialized then it's even simpler.

The driver core does serialize unbind events, and the driver's release
method is called only once.  Thus usb_serial_disconnect() doesn't have
to worry about concurrent invocations.

> The flow with the correct POSIX logic is that on the last close() the
> hardware and tty are disassociated (I guess killing the hardware is
> misleading as a term when USB lets you kill it more drastically as in
> take it offline/unplug it). The same occurs earlier if the device gets a
> hang up event. After that any existing access will fail, but new access
> may succeed in re-opening the port.

Understood.

> > The purpose of disc_mutex is to prevent the driver from doing I/O to a
> > device after they have been unbound.  (It's important to remember that
> > a driver can be unbound from a device without the device being
> > unplugged.)
> 
> > Without the mutex, there's a race between usb_serial_disconnect() and
> > the subdriver's open method.  Does tty_hangup() resolve this race 
> > somehow?
> 
> For the unbind case possibly not. If the unbind calls
> usb_serial_disconnect then it should do so as after the tty_hangup
> completes.

In fact it's the other way around: usb_serial_disconnect calls 
tty_hangup.

>  However tty_hangup is itself asynchronous and there can be
> hanging tty references. It is thus possible that the line discipline can
> call back into the USB driver write method (and one or two other
> functions) after usb_serial_disconnect completes, on both the old and
> updated code. This in fact happens at the moment and there are a few
> traces in kerneloops showing the race is actually hit.
> 
> This is actually a general problem with the tty layer and hotplug devices
> that I am still scratching my head over somewhat. disc_mutex is
> insufficient because of scenarios like
> 
> 	CPU 0				CPU 1
> 
> 	disconnect begins		character arrives (takes kref)
>         mutex taken and dropped
> 
> 	kref keeps tty open
> 					character is echoed
> 					write method of device is called
> 					kref dropped
> 					object actually freed
> 
> Also as tty_hangup is asynchronous you get the simpler case of
> 
> 	disconnect begins
> 	mutex taken
> 	tty_hangup queued
> 	mutex_dropped
> 	disconnect returns
> 	Line discipline writes characters to the port
> 	tty_hangup completes

This is a general problem for all hot-unpluggable devices.  Somehow the 
driver has to synchronize its I/O activity with its unbind routine.

For the tty layer, you want tty_hangup (the synchronous part) to insure 
that the line discipline -- or any other part of the tty core -- won't 
send any more characters to the port.  Ditto for other I/O activities, 
such as baud-rate setting, throttling, etc.  But I get the impression 
that character output is the hard part.

Clearly disc_mutex can't be used for this purpose, since it is private 
to the usb-serial layer.  This has to be done in the tty core.  
Something (a spinlock?) has to synchronize tty_hangup with the line 
discipline.

> The tty hangup has to be asynchronous as it takes all sorts of core file
> system locks and walks file handle lists changing the operations for the
> handle. I far this is going to force some problematic games with
> refcounting as in some cases the tty needs to anchor the port in memory
> and in some cases the port needs to anchor the tty in memory.

When it's really necessary, the tty can anchor the port with a simple 
reference; children anchoring their parents is normal and done all the 
time.  Going the other way is more problematic; in particular you must 
not rely on the port's destructor dropping a reference to the tty.  But 
any time earlier should be okay.  For instance, if the port can drop 
its reference when it is unregistered then there won't be any problem.  
(The same sort of issue surfaced in the SCSI core a while back.)

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux