Hi Steve, > On Oct 28, 2022, at 14:50, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote: > > Trond, > > I'm looking at a commit from 2005: > > 0f9dc2b16884b ("RPC: Clean up socket autodisconnect") > > Cancel autodisconnect requests inside xprt_transmit() in order to avoid > races. > Use more efficient del_singleshot_timer_sync() > > > I'm working on adding a "shutdown" state to timers, making it required for > freeing the timer. This is to address the numerous bugs we hit where timers > get rearmed just before freeing and then cause a crash in the timer code, > without knowing what timer it was that caused it. > > Having a specific shutdown state for timers will remove this problem > because if something tries to rearm a shutdown timer, it will fail and a > WARN_ON_ONCE() is triggered. See below in the "reply" part for a > description of this effort. > > The reason for this email, is because that WARN_ON_ONCE() triggered on the > mod_timer() from: > > static void > xprt_schedule_autodisconnect(struct rpc_xprt *xprt) > __must_hold(&xprt->transport_lock) > { > xprt->last_used = jiffies; > if (RB_EMPTY_ROOT(&xprt->recv_queue) && xprt_has_timer(xprt)) > mod_timer(&xprt->timer, xprt->last_used + xprt->idle_timeout); > } > > That's because xptr->timer was shutdown due to: > > int > xprt_request_enqueue_receive(struct rpc_task *task) > { > [..] > /* Turn off autodisconnect */ > del_singleshot_timer_sync(&xprt->timer); > return 0; > } > > Now singleshot means just that. It's a single shot and calling > del_singleshot_timer_sync() will shut it down so that it can be freed. That > also means that it can no longer be re-armed. > > I'm not sure what you meant by "Use more efficient del_singleshot_timer_sync()" > but I'm guessing since that was written in 2005, it is no longer relevant, > and del_timer_sync() should now be used. > > After replacing that with del_timer_sync(), the warning goes away. > > I just want to confirm that's OK with you. I seem to vaguely remember that at the time, del_timer_sync() would loop in order to catch re-arming timers, whereas del_singleshot_timer_sync() would not, hence the commit message. The expectation for del_singleshot_timer_sync() was simply that the caller would ensure safety against re-arming, which was indeed the case for this code. However if that del_singleshot_timer_sync() expectation has been strengthened to mean that you guarantee never to re-arm the timer at all, then I agree that we should switch to del_timer_sync(). Thanks! Trond _________________________________ Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx