Hi, On 22/04/24 16:31, Valentin Schneider wrote: > Apologies for the delayed reply, I was away for most of last week; > > On 16/04/24 17:01, Eric Dumazet wrote: >> On Mon, Apr 15, 2024 at 4:33 PM Valentin Schneider <vschneid@xxxxxxxxxx> wrote: >>> >>> On 15/04/24 14:35, Eric Dumazet wrote: >>> > On Mon, Apr 15, 2024 at 1:34 PM Valentin Schneider <vschneid@xxxxxxxxxx> wrote: >>> >> v4 -> v5 >>> >> ++++++++ >>> >> >>> >> o Rebased against latest Linus' tree >>> >> o Converted tw_timer into a delayed work following Jakub's bug report on v4 >>> >> http://lore.kernel.org/r/20240411100536.224fa1e7@xxxxxxxxxx >>> > >>> > What was the issue again ? >>> > >>> > Please explain precisely why it was fundamentally tied to the use of >>> > timers (and this was not possible to fix the issue without >>> > adding work queues and more dependencies to TCP stack) >>> >>> In v4 I added the use of the ehash lock to serialize arming the timewait >>> timer vs destroying it (inet_twsk_schedule() vs inet_twsk_deschedule_put()). >>> >>> Unfortunately, holding a lock both in a timer callback and in the context >>> in which it is destroyed is invalid. AIUI the issue is as follows: >>> >>> CPUx CPUy >>> spin_lock(foo); >>> <timer fires> >>> call_timer_fn() >>> spin_lock(foo) // blocks >>> timer_shutdown_sync() >>> __timer_delete_sync() >>> __try_to_del_timer_sync() // looped as long as timer is running >>> <deadlock> >>> >>> In our case, we had in v4: >>> >>> inet_twsk_deschedule_put() >>> spin_lock(ehash_lock); >>> tw_timer_handler() >>> inet_twsk_kill() >>> spin_lock(ehash_lock); >>> __inet_twsk_kill(); >>> timer_shutdown_sync(&tw->tw_timer); >>> >>> The fix here is to move the timer deletion to a non-timer >>> context. Workqueues fit the bill, and as the tw_timer_handler() would just queue >>> a work item, I converted it to a delayed_work. Does this explanation make sense? This is the reasoning that drove me to involve workqueues. I'm open to suggestions on alternative approaches.