On Sun. 13 Feb 2022 at 00:57, Marc Kleine-Budde <mkl@xxxxxxxxxxxxxx> wrote:
> On 12.02.2022 20:27:13, Vincent Mailhol wrote:
> > The driver uses an atomic_t variable: es58x_device:opened_channel_cnt
> > to keep track of the number of opened channels in order to only
> > allocate memory for the URBs when this count changes from zero to one.
> >
> > While the intent was to prevent race conditions, the choice of an
> > atomic_t turns out to be a bad idea for several reasons:
> >
> > - implementation is incorrect and fails to decrement
> >   opened_channel_cnt when the URB allocation fails as reported in
> >   [1].
> >
> > - even if opened_channel_cnt were to be correctly decremented,
> >   atomic_t is insufficient to cover edge cases: there can be a race
> >   condition in which 1/ a first process fails to allocate URBs
> >   memory 2/ a second process enters es58x_open() before the first
> >   process does its cleanup and decrements opened_channel_cnt. In
> >   which case, the second process would successfully return despite
> >   the URBs memory not being allocated.
> >
> > - actually, any kind of locking mechanism was useless here because
> >   it is redundant with the network stack big kernel lock
> >   (a.k.a. rtnl_lock) which is being held by all the callers of
> >   net_device_ops:ndo_open() and net_device_ops:ndo_close(). cf. the
> >   ASSERT_RTNL() calls in __dev_open() [2] and __dev_close_many()
> >   [3].
> >
> > The atomic_t is thus replaced by a simple u8 type and the logic to
> > increment and decrement es58x_device:opened_channel_cnt is simplified
> > accordingly fixing the bug reported in [1]. We do not check again for
> > ASSERT_RTNL() as this is already done by the callers.
> >
> > [1] https://lore.kernel.org/linux-can/20220201140351.GA2548@kili/T/#u
> > [2] https://elixir.bootlin.com/linux/v5.16/source/net/core/dev.c#L1463
> > [3] https://elixir.bootlin.com/linux/v5.16/source/net/core/dev.c#L1541
> >
> > Fixes: 8537257874e9 ("can: etas_es58x: add core support for ETAS ES58X
> > CAN USB interfaces")
> > Reported-by: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
> > Signed-off-by: Vincent Mailhol <mailhol.vincent@xxxxxxxxxx>
>
> Applied to can/testing.
>
> If you (or someone else) want to increase their patch count feel free to
> convert the other USB CAN drivers from atomic_t to u8, too.

Actually, not so many drivers are impacted:

| $ grep -R atomic_t drivers/net/can/
| drivers/net/can/c_can/c_can.h: atomic_t sie_pending;
| drivers/net/can/usb/esd_usb2.c: atomic_t active_tx_jobs;
| drivers/net/can/usb/ems_usb.c: atomic_t active_tx_urbs;
| drivers/net/can/usb/gs_usb.c: atomic_t active_tx_urbs;
| drivers/net/can/usb/gs_usb.c: atomic_t active_channels;
| drivers/net/can/usb/mcba_usb.c: atomic_t free_ctx_cnt;
| drivers/net/can/usb/usb_8dev.c: atomic_t active_tx_urbs;
| drivers/net/can/usb/peak_usb/pcan_usb_core.h: atomic_t active_tx_urbs;
| drivers/net/can/usb/etas_es58x/es58x_core.h: atomic_t tx_urbs_idle_cnt;
| drivers/net/can/usb/etas_es58x/es58x_core.c: atomic_t *idle_cnt = &es58x_dev->tx_urbs_idle_cnt;

The only relevant one seems to be gs_usb with its atomic_t
active_channels. I looked at the code; the change to u8 shouldn't be
too hard. But aside from that, I am also concerned by the absence of
an exit path in gs_can_open() to free the allocated URB memory when an
error occurs. I will send a patch to change active_channels from
atomic_t to u8; however, I will not rework the error path to free the
allocated URB memory.

Also, we need to double check that none of the drivers uses a spinlock
or mutex in their open() or close() functions.
I gave it a first glance and didn't find anything outstanding, but I
will need to spend a bit of extra time on that to confirm.

Yours sincerely,
Vincent Mailhol
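
For illustration only, the counting logic discussed in this thread can be
sketched in plain userspace C. This is not the es58x or gs_usb code: the
struct demo_device type, the demo_*() helpers and the fail_next_alloc hook
are all hypothetical names made up for this sketch, and uint8_t stands in
for the kernel's u8. It assumes, as argued above, that open and close are
always serialized by the caller (rtnl_lock in the kernel), so a plain
counter is enough and a failed allocation simply leaves the counter
untouched.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a driver's per-device state. */
struct demo_device {
	uint8_t opened_channel_cnt;	/* plain counter: callers are serialized */
	bool urbs_allocated;		/* stands in for the URB memory */
	bool fail_next_alloc;		/* hook to simulate an allocation failure */
};

/* Pretend to allocate URB memory; fails once if the hook is set. */
static int demo_alloc_urbs(struct demo_device *dev)
{
	if (dev->fail_next_alloc) {
		dev->fail_next_alloc = false;
		return -1;
	}
	dev->urbs_allocated = true;
	return 0;
}

static void demo_free_urbs(struct demo_device *dev)
{
	dev->urbs_allocated = false;
}

/*
 * Allocate only when the count goes from zero to one. The counter is
 * incremented after the allocation succeeded, so the error path has
 * nothing to undo.
 */
static int demo_open(struct demo_device *dev)
{
	int ret;

	if (dev->opened_channel_cnt == 0) {
		ret = demo_alloc_urbs(dev);
		if (ret)
			return ret;
	}
	dev->opened_channel_cnt++;

	return 0;
}

/* Free the memory when the last channel is closed. */
static void demo_close(struct demo_device *dev)
{
	dev->opened_channel_cnt--;
	if (dev->opened_channel_cnt == 0)
		demo_free_urbs(dev);
}
```

With this ordering, a failed first demo_open() leaves the count at zero, so
the next caller retries the allocation instead of returning successfully
without any memory allocated.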