On Fri, Apr 24, 2020 at 08:28:09AM -0700, Divya Indi wrote: > If we look at the query, it does not appear to be a valid ib_sa_query. Instead > looks like a pid struct for a process -> Use-after-free situation. > > We could simulate the crash by explicitly introducing a delay in ib_nl_snd_msg with > a sleep. The timer kicks in before ib_nl_send_msg has even sent out the request > and releases the query. We could reproduce the crash with a similar stack trace. > > To summarize - We have a use-after-free possibility here when the timer(ib_nl_request_timeout) > kicks in before ib_nl_snd_msg has completed sending the query out to ibacm via netlink. The > timeout handler ie ib_nl_request_timeout may result in releasing the query while ib_nl_snd_msg > is still accessing query. > > Appreciate your thoughts on the above issue. Your assesment looks right to me. Fixing it will require being able to clearly explain what the lifetime cycle is for ib_sa_query - and what is there today looks like a mess, so I'm not sure what it should be changed into. There is lots of other stuff there that looks really weird, like ib_sa_cancel_query() keeps going even though there are still timers running.. Jason