On Tue, 14 Apr 2009 15:44:05 -0400 (EDT) Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote: > On Tue, 14 Apr 2009, Alan Cox wrote: > > > (Please direct tty layer patches to me as well as Greg as I'm the one > > currently trying to beat the tty layer into submission) > > Sorry; I didn't realize this would interact quite so much with the tty > layer. No problem, I doubt you could have deduced I was rewriting that code unless you followed the serial lists religiously and are telepathic. > > This entire code area has been significantly revamped in the devel code > > to use the core tty_port_* helper functions. > > Are there a lot of pending changes to usb-serial.c that aren't already > in Greg's tree? For the open/close/hangup path yes. However they are not in a state that will be ready for this release so your patches are very good news indeed and unlike the devel tree are backportable. > > I'll go over this patch in more detail tomorrow however. > > > > > - usb_serial_put(port->serial); > > > + put_device(&port->dev); > > > + > > > + /* Mustn't dereference port any more */ > > > + if (count == 0) { > > > > So why is the put_device just before this safe - it dereferences port ? > > The comment means that as a result of the put_device, port is now a > stale pointer. Should I rephrase it (or omit it entirely)? Oh so put_device potentially dropped the last reference ? It might be clearer to say so. > I'm very confused. By "the tty port kref" do you mean > port->dev.kobject.kref, or do you mean something else? > > By "the USB serial device level destructor" do you mean the > destroy_serial() routine? (Just want to make sure I understand > correctly.) > > Is "a kref to the tty" different from "the tty port kref"? There are two objects with differing lifetimes. struct tty_struct exists for the lifetime of a use of the port (ie while it is open). Multiple opens use the same tty_struct bind to the same struct tty. struct tty_port is embedded in struct usb_port and exists for the lifetime of the hardware (until it is unplugged, the driver unloaded etc). The direction to the tty layer is heading in is to have a tty_port object representing every port of all types. Historically each hardware or type of hardware had a unique object for the hardware side which caused massive code copying and makes it hard to change stuff. > In fact, I don't see how this cancel_work_sync() call can be an issue > at all. Is there anything wrong with using it in place of > flush_scheduled_work()? For the current fix nothing at all. I was thinking more in the longer term. > > The easiest way to handle this is to kill the entire disc_mutex. It > > serves no real purpose except to serialize a one shot event and report > > it. You can do that with test_and_set_bit() > > > > In usb_serial_disconnect() > > > > if(test_and_set_bit(...)) > > return; > > > > and test the bit in the disconnect path > > I don't get it. What purpose does test_and_set_bit() serve if it > occurs in only one place? We can get disconnects from multiple sources in parallel as far as I can tell (think about rmmod and unplug at the same time). If those are serialized then it's even simpler. > > usb_serial_disconnect > > t&sb blocks recursion > > calls hangup > > kills hardware in the recursion blocked path > > How does hangup go about killing the hardware? Does it somehow arrange > for usb_serial_disconnect() to be called again? The flow with the correct POSIX logic is that on the last close() the hardware and tty are disassociated (I guess killing the hardware is misleading as a term when USB lets you kill it more drastically as in take it offline/unplug it). The same occurs earlier if the device gets a hang up event. After that any existing access will fail, but new access may succeed in re-opening the port. > The purpose of disc_mutex is to prevent the driver from doing I/O to a > device after they have been unbound. (It's important to remember that > a driver can be unbound from a device without the device being > unplugged.) > Without the mutex, there's a race between usb_serial_disconnect() and > the subdriver's open method. Does tty_hangup() resolve this race > somehow? For the unbind case possibly not. If the unbind calls usb_serial_disconnect then it should do so as after the tty_hangup completes. However tty_hangup is itself asynchronous and there can be hanging tty references. It is thus possible that the line discipline can call back into the USB driver write method (and one or two other functions) after usb_serial_disconnect completes, on both the old and updated code. This in fact happens at the moment and there are a few traces in kerneloops showing the race is actually hit. This is actually a general problem with the tty layer and hotplug devices that I am still scratching my head over somewhat. disc_mutex is insufficient because of scenarios like CPU 0 CPU 1 disconnect begins character arrives (takes kref) mutex taken and dropped kref keeps tty open character is echoed write method of device is called kref dropped object actually freed Also as tty_hangup is asynchronous you get the simpler case of disconnect begins mutex taken tty_hangup queued mutex_dropped disconnect returns Line discipline writes characters to the port tty_hangup completes The tty hangup has to be asynchronous as it takes all sorts of core file system locks and walks file handle lists changing the operations for the handle. I far this is going to force some problematic games with refcounting as in some cases the tty needs to anchor the port in memory and in some cases the port needs to anchor the tty in memory. Alan -- To unsubscribe from this list: send the line "unsubscribe linux-usb" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html