On Tue, Nov 15, 2022 at 09:28:32PM +0100, Thomas Gleixner wrote:
> Tearing down timers can be tedious when there are circular dependencies to
> other things which need to be torn down. A prime example is timer and
> workqueue where the timer schedules work and the work arms the timer.
>
> Steven and the Google Chromebook team ran into such an issue in the
> Bluetooth HCI code.
>
> Steven suggested to create a new function del_timer_free() which marks the
> timer as shutdown. Rearm attempts of shutdown timers are discarded and he
> wanted to emit a warning for that case:
>
>    https://lore.kernel.org/all/20220407161745.7d6754b3@xxxxxxxxxxxxxxxxxx
>
> This resulted in a lengthy discussion and suggestions how this should be
> implemented. The patch series went through several iterations and during
> the review of the last version it turned out that this approach is
> suboptimal:
>
>    https://lore.kernel.org/all/20221110064101.429013735@xxxxxxxxxxx
>
> The warning is not really helpful because it's entirely unclear how it
> should be acted upon. The only way to address such a case is to add 'if
> (in_shutdown)' conditionals all over the place. This is error prone and in
> most cases of teardown like the HCI one which started this discussion not
> required at all.
>
> What needs to be prevented is that pending work which is drained via
> destroy_workqueue() rearms the previously shutdown timer. Nothing
> in that shutdown sequence relies on the timer being functional.
>
> The conclusion was that the semantics of timer_shutdown_sync() should be:
>
>    - timer is not enqueued
>    - timer callback is not running
>    - timer cannot be rearmed
>
> Preventing the rearming of shutdown timers is done by discarding rearm
> attempts silently.
>
> As Steven is short of cycles, I made some spare cycles available and
> reworked the patch series to follow the new semantics and plugged the races
> which were discovered during review.
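
For anyone following along, the three guarantees listed above and the
"discard silently" rule can be sketched as a toy userspace model. This is
only an illustration under loud assumptions: every toy_* name is
hypothetical, nothing here is the kernel implementation, and there is no
concurrency, so "callback is not running" is trivially true in this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model. Every toy_* name is hypothetical; this is not the
 * kernel's struct timer_list and there is no concurrency here. */
struct toy_timer {
	void (*function)(struct toy_timer *t);	/* NULL once shut down */
	unsigned long expires;
	bool pending;				/* true while enqueued */
};

static void toy_timer_setup(struct toy_timer *t,
			    void (*fn)(struct toy_timer *t))
{
	assert(fn != NULL);
	t->function = fn;
	t->expires = 0;
	t->pending = false;
}

/* Arm or rearm. Returns true if the timer is now enqueued, false if the
 * attempt was silently discarded because the timer was shut down. */
static bool toy_mod_timer(struct toy_timer *t, unsigned long expires)
{
	if (!t->function)
		return false;	/* shut down: discard, no warning */
	t->expires = expires;
	t->pending = true;
	return true;
}

static void toy_timer_shutdown_sync(struct toy_timer *t)
{
	t->pending = false;	/* timer is not enqueued */
	/* A real implementation would wait here for a running callback. */
	t->function = NULL;	/* timer cannot be rearmed */
}

/* Timer callback: in the circular case it would schedule the work. */
static void toy_timer_fn(struct toy_timer *t)
{
	(void)t;
}

/* Work item that arms the timer, closing the circle. */
static bool toy_rearm_work(struct toy_timer *t)
{
	return toy_mod_timer(t, t->expires + 10);
}

/* Teardown order from the cover letter: shut the timer down first, then
 * drain pending work. Returns the number of rearms that got through. */
int toy_teardown_demo(void)
{
	struct toy_timer t;
	int rearmed = 0;

	toy_timer_setup(&t, toy_timer_fn);
	toy_mod_timer(&t, 100);

	toy_timer_shutdown_sync(&t);

	/* Stand-in for destroy_workqueue() draining one pending work item. */
	if (toy_rearm_work(&t))
		rearmed++;

	return rearmed + (t.pending ? 1 : 0);
}
```

In this model the drained work item needs no 'if (in_shutdown)' check: its
rearm attempt simply returns false and the timer stays dequeued.
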
>
> The patches have been split up into small pieces to make review easier and
> I took the liberty to throw a bunch of overdue cleanups into the picture
> instead of proliferating the existing state further.
>
> The last patch in the series addresses the HCI teardown issue for real.
>

I applied the series to the top of v6.1-rc5, and also applied the result
of running the coccinelle script to auto-convert simple cases. Running this
set of patches through my testbed showed no build errors, runtime failures,
or warnings.

I also backported the series to chromeos-5.15, again applied the coccinelle
generated patches, and ran it through a regression test. No failures either.

With that, for the series,

Tested-by: Guenter Roeck <linux@xxxxxxxxxxxx>

Let me know if I should send individual tags for each patch in the series.

Thanks,
Guenter