On Fri, Apr 21, 2017 at 05:51:25PM +0200, Jon Maloy wrote:
> From: Jon Paul Maloy <jon.maloy@xxxxxxxxxxxx>
>
> commit d25a01257e422a4bdeb426f69529d57c73b235fe upstream
>
> When the TIPC module is unloaded, we have identified a race condition
> that allows a node reference counter to go to zero and the node instance
> being freed before the node timer is finished with accessing it. This
> leads to occasional crashes, especially in multi-namespace environments.
>
> The scenario goes as follows:
>
> CPU0:(node_stop)                 CPU1:(node_timeout)  // ref == 2
>
> 1:                                    if(!mod_timer())
> 2: if (del_timer())
> 3:   tipc_node_put()    // ref -> 1
> 4: tipc_node_put()      // ref -> 0
> 5:   kfree_rcu(node);
> 6:                                         tipc_node_get(node)
> 7:                                         // BOOM!
>
> We now clean up this functionality as follows:
>
> 1) We remove the node pointer from the node lookup table before we
>    attempt deactivating the timer. This way, we reduce the risk that
>    tipc_node_find() may obtain a valid pointer to an instance marked
>    for deletion; a harmless but undesirable situation.
>
> 2) We use del_timer_sync() instead of del_timer() to safely deactivate
>    the node timer without any risk that it might be reactivated by the
>    timeout handler. There is no risk of deadlock here, since the two
>    functions never touch the same spinlocks.
>
> 3: We remove a pointless tipc_node_get() + tipc_node_put() from the
>    timeout handler.
>
> Reported-by: Zhijiang Hu <huzhijiang@xxxxxxxxx>
> Acked-by: Ying Xue <ying.xue@xxxxxxxxxxxxx>
> Signed-off-by: Jon Maloy <jon.maloy@xxxxxxxxxxxx>
> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
> ---
>  net/tipc/node.c | 24 +++++++++++-------------
>  1 file changed, 11 insertions(+), 13 deletions(-)

Now queued up, thanks.

greg k-h
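
For anyone following along, the interleaving in the quoted scenario can be replayed as a small userspace sketch. This is not the code in net/tipc/node.c. struct toy_node, toy_node_put(), replay_race() and replay_fixed() are hypothetical names made up for illustration. The timer is modelled as a plain flag, and the two CPUs are modelled by running the steps sequentially in the quoted order. The reference bookkeeping in replay_fixed() is an assumption of this model; it only illustrates that the stop path proceeds after the handler has finished, which is the guarantee del_timer_sync() gives.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a TIPC node; only what the replay needs (hypothetical). */
struct toy_node {
    int  ref;            /* models the node reference counter */
    bool timer_pending;  /* models "the node timer is armed" */
    bool freed;          /* set where the real code would call kfree_rcu() */
};

/* Drop one reference; mark the node freed when the count reaches zero. */
static void toy_node_put(struct toy_node *n)
{
    assert(n->ref > 0);  /* a put on a zero count would be a model bug */
    if (--n->ref == 0)
        n->freed = true;
}

/* Replays steps 1-7 of the quoted scenario in order.
 * Returns true if step 6 would touch an already freed node. */
static bool replay_race(void)
{
    struct toy_node n = { .ref = 2, .timer_pending = false, .freed = false };

    /* 1: CPU1, timeout handler running: mod_timer() re-arms the timer and
     *    reports that it was inactive, so the handler intends to take a ref. */
    bool was_inactive = !n.timer_pending;
    n.timer_pending = true;

    /* 2: CPU0, stop path: del_timer() finds the re-armed timer pending. */
    bool was_pending = n.timer_pending;
    n.timer_pending = false;

    /* 3: CPU0 drops the reference it attributes to the timer. */
    if (was_pending)
        toy_node_put(&n);               /* ref -> 1 */

    /* 4, 5: CPU0 drops the last reference; the node is freed. */
    toy_node_put(&n);                   /* ref -> 0 */

    /* 6, 7: CPU1 resumes and would now take a reference on freed memory. */
    return was_inactive && n.freed;
}

/* Same starting state, but the stop path only continues once the handler
 * has completed, and the handler takes no extra reference of its own. */
static bool replay_fixed(void)
{
    struct toy_node n = { .ref = 2, .timer_pending = false, .freed = false };

    /* CPU1: handler runs to completion and re-arms the timer. */
    bool handler_saw_freed = n.freed;
    n.timer_pending = true;

    /* CPU0: synchronous delete returns only after the handler is done. */
    n.timer_pending = false;

    /* CPU0: both references are dropped after the handler has finished. */
    toy_node_put(&n);                   /* ref -> 1 */
    toy_node_put(&n);                   /* ref -> 0 */

    return handler_saw_freed;
}
```

Called from a small main(), replay_race() returns true (the handler's step 6 lands on a freed node) and replay_fixed() returns false. A real reproduction would need actual concurrency; the sequential replay only makes the reference-count arithmetic of the scenario easy to check.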