Re: [PATCH for-rc] RDMA/cma: fix race between addr_handler and resolve_route

Håkon Bugge <haakon.bugge@xxxxxxxxxx> · Mon, 6 Apr 2020 20:02:41 +0200

> On 6 Apr 2020, at 19:31, Jason Gunthorpe <jgg@xxxxxxxxxxxx> wrote:
> 
> On Mon, Apr 06, 2020 at 07:00:46PM +0200, Håkon Bugge wrote:
>> 
>> 
>>> On 3 Apr 2020, at 21:36, Jason Gunthorpe <jgg@xxxxxxxxxxxx> wrote:
>>> 
>>> On Fri, Apr 03, 2020 at 09:07:10PM +0200, Håkon Bugge wrote:
>>>> 
>>>> 
>>>>> On 3 Apr 2020, at 20:57, Jason Gunthorpe <jgg@xxxxxxxxxxxx> wrote:
>>>>> 
>>>>> On Fri, Apr 03, 2020 at 08:43:28PM +0200, Håkon Bugge wrote:
>>>>>> A syzkaller test hits a NULL pointer dereference in
>>>>>> rdma_resolve_route():
>>>>> 
>>>>> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
>>>>> 
>>>>> This commit in 5.7 probably fixes this:
>>>> 
>>>> I think it will not. The mutex in 7c11910783a1 ("RDMA/ucma: Put a
>>>> lock around every call to the rdma_cm layer") will not prevent
>>>> addr_handler() to run concurrently with rdma_resolve_route(), right?
>>> 
>>> Hmm. Perhaps so. But your patch isn't nearly enough if that is the
>>> case, you've only considered resolve_route, but it could run
>>> concurrently with *anything*, with the usual problems.
>> 
>> I was about to argue my case, but looking more into the code I tend
>> to agree with you more and more. I thought, at least, that the
>> rdma_foo_bar() functions had atomic checks on the state and bailed
>> out if the state was inappropriate, but that is not the case all the
>> time.
>> 
>> And the code that transitions the state from FOO to BAR, letting
>> others observe BAR, and then conditionally setting it back to FOO
>> isn't exactly great...
> 
> Ah, I did remember it right then :)
> 
>>> Probably the simplest answer is to have ucma fail operations that are
>>> not permitted while an async_handler is pending. I'm guessing the only
>>> operation that would be valid is rdma_destroy_id?
>> 
>> Yes, that would take care of this class of errors, I agree. On a
>> real systems (vs. the syzkaller setup), you may receive (almost any)
>> CM packet ay any time, and that concurrency we must handle.
> 
> As I understand it, a correct application has to manage the RDMA CM
> state machine, so if it has launched a background resolution then it
> must complete/stop that before going on to do something else?
> 
> ie this is why the kernel ULPs theoretically don't need the extra
> locking in ucm?

We expect kernel ULPs to behave as good citizens. For example, we expect them to get the ADDR_RESOLVED event _before_ calling resolve_route() (as opposed to this syzkaller repro).

> However, I'm not sure what the state machine is supposed to look
> like..

The neat state machines in Figure 13{5,6} in IBTA version 1.3. If you do not have it handy, I can send you the PDF off list.

>> Shall I make a v2 base on next based on this idea, or do you have
>> something coming?
> 
> Sure, I have nothing :)
> 
> Also that rdma_destroy_id in addr_handler looks wrong.. ie we still
> retain pointers to the rdma_cm_id it destroys inside the struct
> ucma_context, don't we?

On entry from user-space, we use the u32 id and looks it up in the XArray. But if rdma_destoy_id() is called asynchronously called between ucma_get_ctx_dev() and the de-reference of ctx->cm_id, we are toast.

>>> Until that time we are taking a 'Big Lock' approach to all concurrancy
>>> problems with rdma_cm as this code is *completely* broken for
>>> concurrency.
>> 
>> Have you considered putting the hammer inside the cm_id instead of
>> the context? Would that allow more concurrency and still avoid the
>> 10ish syzkaller bugs?
> 
> It is, isn't it? Here the 'struct ucma_context *' is actually
> per-cm_id, confusing right?

Must be, since it has a single "u32 id" ;-)

Håkon