On Mon, 2022-03-14 at 09:03 +1100, NeilBrown wrote: > On Sun, 13 Mar 2022, Trond Myklebust wrote: > > On Tue, 2022-03-08 at 13:42 +1100, NeilBrown wrote: > > > > > > xprt_destory() claims XPRT_LOCKED and then calls > > > del_timer_sync(). > > > Both xprt_unlock_connect() and xprt_release() call > > > ->release_xprt() > > > which drops XPRT_LOCKED and *then* xprt_schedule_autodisconnect() > > > which calls mod_timer(). > > > > > > This may result in mod_timer() being called *after* > > > del_timer_sync(). > > > When this happens, the timer may fire long after the xprt has > > > been > > > freed, > > > and run_timer_softirq() will probably crash. > > > > > > The pairing of ->release_xprt() and > > > xprt_schedule_autodisconnect() is > > > always called under ->transport_lock. So if we take - > > > >transport_lock > > > to > > > call del_timer_sync(), we can be sure that mod_timer() will run > > > first > > > (if it runs at all). > > > > xprt_destroy() never releases XPRT_LOCKED, so how could the above > > race > > happen? > > the race would happen as xprt_destroy() is taking the lock. > One core is in xprt_destroy(), the other in xprt_unlock_connect() > > Core 1 Core 2 > spin_lock(transport_lock) > xprt->ops->release_xprt() > aka xprt_release_xprt() > This clears XPRT_LOCKED > wait_on_bit_lock(XPRT_LOCKED) > This doesn't need to wait > del_timer_sync(->timer) > xprt_schedule_autodisconnect() > calls mod_timer(->timer) > spin_unlock(transport_lock) > wake_up_bit(XPRT_LOCKED) > > The mod_timer() call being after del_timer_sync() is the problem. > > If the wait_on_bit_lock() was just a little earlier and had to wait, > it > wouldn't be woken until it was safe, so it is a small race window to > hit. > > An alternate fix might be to move the xprt_schedule_autodisconnect() > call before xprt->ops->release_xprt(), but as a spinlock was already > used to group these operations, I thought it simpler to use the > spinlock > to remove the race. > > The only evidence I have that it is possible at all is two crash > dumps > with run_timer_softirq() tripping over a corrupt timer. > The timer function is xprt_init_autodisconnect() > The ->next pointer is LIST_POISON2, so detach_timer() must have been > called on the timer, but it was still found in the list of pending > timers. > > It looks like the xprt was freed just after above race, so timer was > still active, then allocated again so timer was reinitialised, then > freed again, so del_timer_sync() as called and ->next became > LIST_POISON2. At the time of the crash, the memory was unallocated. > > I cannot be 100% certain this race is happening, but I cannot find > any > other path by which it is even vaguely possible. The above looks plausible, and I think this patch is the correct one. The one additional change I suggest is that we also move the wake_up_bit() in xprt_connect_unlock() inside the spinlock, so that we avoid a potential use-after-free. -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@xxxxxxxxxxxxxxx