Re: [PATCH v4 2/6] can: etas_es58x: add devlink support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 28, 2022 at 02:32:23PM +0900, Vincent MAILHOL wrote:
> On Mon. 28 Nov. 2022 at 10:34, Vincent MAILHOL
> <mailhol.vincent@xxxxxxxxxx> wrote:
> > On Mon. 28 Nov. 2022 at 00:41, Alan Stern <stern@xxxxxxxxxxxxxxxxxxx> wrote:
> > > On Sun, Nov 27, 2022 at 02:10:32PM +0900, Vincent MAILHOL wrote:
> > > > > Should devlink_free() be after usb_set_inftdata()?
> > > >
> > > > A look at
> > > >   $ git grep -W "usb_set_intfdata(.*NULL)"
> > > >
> > > > shows that the two patterns (freeing before or after
> > > > usb_set_intfdata()) coexist.
> > > >
> > > > You are raising an important question here. usb_set_intfdata() does
> > > > not have documentation that freeing before it is risky. And the
> > > > documentation of usb_driver::disconnect says that:
> > > >   "@disconnect: Called when the interface is no longer accessible,
> > > >    usually because its device has been (or is being) disconnected
> > > >    or the driver module is being unloaded."
> > > >   Ref: https://elixir.bootlin.com/linux/v6.1-rc6/source/include/linux/usb.h#L1130
> > > >
> > > > So the interface no longer being accessible makes me assume that the
> > > > order does not matter. If it indeed matters, then this is a foot gun
> > > > and there is some clean-up work waiting for us on many drivers.
> > > >
> > > > @Greg, any thoughts on whether or not the order of usb_set_intfdata()
> > > > and resource freeing matters or not?
> > >
> > > In fact, drivers don't have to call usb_set_intfdata(NULL) at all; the
> > > USB core does it for them after the ->disconnect() callback returns.
> >
> > Interesting. This fact is widely unknown, cf:
> >   $ git grep "usb_set_intfdata(.*NULL)" | wc -l
> >   215
> >
> > I will do some clean-up later on, at least for the CAN USB drivers.
> >
> > > But if a driver does make the call, it should be careful to ensure that
> > > the call happens _after_ the driver is finished using the interface-data
> > > pointer.  For example, after all outstanding URBs have completed, if the
> > > completion handlers will need to call usb_get_intfdata().
> >
> > ACK. I understand that it should be called *after* the completion of
> > any ongoing task.
> >
> > My question was more on:
> >
> >         devlink_free(priv_to_devlink(es58x_dev));
> >         usb_set_intfdata(intf, NULL);
> >
> > VS.
> >
> >         usb_set_intfdata(intf, NULL);
> >         devlink_free(priv_to_devlink(es58x_dev));
> >
> > From your comments, I understand that both are fine.
> 
> Do we agree that the usb-skeleton is doing it wrong?
>   https://elixir.bootlin.com/linux/latest/source/drivers/usb/usb-skeleton.c#L567
> usb_set_intfdata(interface, NULL) is called before deregistering the
> interface and terminating the outstanding URBs!

Going through the usb-skeleton.c source code, you will find that 
usb_get_intfdata() is called from only a few routines:

	skel_open()
	skel_disconnect()
	skel_suspend()
	skel_pre_reset()
	skel_post_reset()

Of those, all but the first are called only by the USB core and they are 
mutually exclusive with disconnect processing (except for 
skel_disconnect() itself, of course).  So they don't matter.

The first, skel_open(), can be called as a result of actions by the 
user, so the driver needs to ensure that this can't happen after it 
clears the interface-data pointer.  The user can open the device file at 
any time before the minor number is given back, so it is not proper to 
call usb_set_intfdata(interface, NULL) before usb_deregister_dev() -- 
but the driver does exactly this!

(Well, it's not quite that bad.  skel_open() does check whether the 
interface-data pointer value it gets from usb_get_intfdata() is NULL.  
But it's still a race.)

So yes, the current code is wrong.  And in fact, it will still be wrong 
even after the usb_set_intfdata(interface, NULL) line is removed, 
because there is no synchronization between skel_open() and 
skel_disconnect().  It is possible for skel_disconnect() to run to 
completion and the USB core to clear the interface-data pointer all 
while skel_open() is running.  The driver needs a static private mutex 
to synchronize opens with unregistrations.  (This is a general 
phenomenon, true of all drivers that have a user interface such as a 
device file.)

The driver _does_ have a per-instance mutex, dev->io_mutex, to 
synchronize I/O with disconnects.  But that's separate from 
synchronizing opens with unregistrations, because at open time the 
driver doesn't yet know the address of the private data structure or 
even if the structure is still allocated.  So obviously it can't use a 
mutex that is embedded within the private data structure for this 
purpose.

Alan Stern



[Index of Archives]     [Automotive Discussions]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [CAN Bus]

  Powered by Linux