On Fri, Apr 03, 2020 at 09:07:10PM +0200, Håkon Bugge wrote:
>
>
> > On 3 Apr 2020, at 20:57, Jason Gunthorpe <jgg@xxxxxxxxxxxx> wrote:
> >
> > On Fri, Apr 03, 2020 at 08:43:28PM +0200, Håkon Bugge wrote:
> >> A syzkaller test hits a NULL pointer dereference in
> >> rdma_resolve_route():
> >
> > #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
> >
> > This commit in 5.7 probably fixes this:
>
> I think it will not. The mutex in 7c11910783a1 ("RDMA/ucma: Put a
> lock around every call to the rdma_cm layer") will not prevent
> addr_handler() to run concurrently with rdma_resolve_route(), right?

Hmm. Perhaps so. But your patch isn't nearly enough if that is the
case, you've only considered resolve_route, but it could run
concurrently with *anything*, with the usual problems.

Plus addr_handler calls rdma_destroy_id().. Oh wow is that ever
completely screwed up. Sigh.

Probably the simplest answer is to have ucma fail operations that are
not permitted while an async_handler is pending. I'm guessing the only
operation that would be valid is rdma_destroy_id?

> And, I also suspect 7c11910783a1 to have major performance
> impact. But, that's a different story.

*shrug* I no longer care. The work to fix this in a performant way is
enormous and nobody wants to do it. Until that time we are taking a
'Big Lock' approach to all concurrency problems with rdma_cm as this
code is *completely* broken for concurrency.

Which is why I'm not taking this patch..

Jason