Re: [PATCH] Revert "RDMA/rxe: Remove unnecessary mr testing"

On 12/2/2022 7:45 PM, Zhu Yanjun wrote:
> On Fri, Dec 2, 2022 at 7:02 PM Daisuke Matsuda
> <matsuda-daisuke@xxxxxxxxxxx> wrote:
>>
>> The commit 686d348476ee ("RDMA/rxe: Remove unnecessary mr testing") causes
>> a kernel crash. If the responder gets a zero-byte RDMA Read request,
>> qp->resp.mr is not set in check_rkey(). The mr is NULL in this case, and a
>> NULL pointer dereference occurs in rxe_put(), as shown below.
>>
>> [  139.607580] BUG: kernel NULL pointer dereference, address: 0000000000000010
>> [  139.609169] #PF: supervisor write access in kernel mode
>> [  139.610314] #PF: error_code(0x0002) - not-present page
>> [  139.611434] PGD 0 P4D 0
>> [  139.612031] Oops: 0002 [#1] PREEMPT SMP PTI
>> [  139.612975] CPU: 2 PID: 3622 Comm: python3 Kdump: loaded Not tainted 6.1.0-rc3+ #34
>> [  139.614465] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
>> [  139.616142] RIP: 0010:__rxe_put+0xc/0x60 [rdma_rxe]
>> [  139.617065] Code: cc cc cc 31 f6 e8 64 36 1b d3 41 b8 01 00 00 00 44 89 c0 c3 cc cc cc cc 41 89 c0 eb c1 90 0f 1f 44 00 00 41 54 b8 ff ff ff ff <f0> 0f c1 47 10 83 f8 01 74 11 45 31 e4 85 c0 7e 20 44 89 e0 41 5c
>> [  139.620451] RSP: 0018:ffffb27bc012ce78 EFLAGS: 00010246
>> [  139.621413] RAX: 00000000ffffffff RBX: ffff9790857b0580 RCX: 0000000000000000
>> [  139.622718] RDX: ffff979080fe145a RSI: 000055560e3e0000 RDI: 0000000000000000
>> [  139.624025] RBP: ffff97909c7dd800 R08: 0000000000000001 R09: e7ce43d97f7bed0f
>> [  139.625328] R10: ffff97908b29c300 R11: 0000000000000000 R12: 0000000000000000
>> [  139.626632] R13: 0000000000000000 R14: ffff97908b29c300 R15: 0000000000000000
>> [  139.627941] FS:  00007f276f7bd740(0000) GS:ffff9792b5c80000(0000) knlGS:0000000000000000
>> [  139.629418] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  139.630480] CR2: 0000000000000010 CR3: 0000000114230002 CR4: 0000000000060ee0
>> [  139.631805] Call Trace:
>> [  139.632288]  <IRQ>
>> [  139.632688]  read_reply+0xda/0x310 [rdma_rxe]
>> [  139.633515]  rxe_responder+0x82d/0xe50 [rdma_rxe]
>> [  139.634398]  do_task+0x84/0x170 [rdma_rxe]
>> [  139.635187]  tasklet_action_common.constprop.0+0xa7/0x120
>> [  139.636189]  __do_softirq+0xcb/0x2ac
>> [  139.636877]  do_softirq+0x63/0x90
>> [  139.637505]  </IRQ>
>>
>> Link: https://lore.kernel.org/lkml/1666582315-2-1-git-send-email-lizhijian@xxxxxxxxxxx/
>> Signed-off-by: Daisuke Matsuda <matsuda-daisuke@xxxxxxxxxxx>

Good catch. What workload were you running when you hit this?
I have never seen it in the pyverbs tests.

TODO: add a pyverbs test to cover this scenario.

Reviewed-by: Li Zhijian <lizhijian@xxxxxxxxxxx>



>> ---
>> NOTE:
>>   I think the commit 686d348476ee is not yet merged to Torvalds' tree.
>>   Perhaps we may just remove the patch from the for-next tree.
>>   I leave that to the maintainers as I am not familiar with patch reversion.
> 
> Sure. If this is for for-next, the subject should say so, for example:
> [for-next PATCH] Revert "RDMA/rxe: Remove unnecessary mr testing"
> 
> Thanks.
> Zhu Yanjun
> 
>>
>>   drivers/infiniband/sw/rxe/rxe_resp.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
>> index 6761bcd1d4d8..5d3a4c6f81a3 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_resp.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_resp.c
>> @@ -832,7 +832,8 @@ static enum resp_states read_reply(struct rxe_qp *qp,
>>
>>          err = rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt),
>>                            payload, RXE_FROM_MR_OBJ);
>> -       rxe_put(mr);
>> +       if (mr)
>> +               rxe_put(mr);
>>          if (err) {
>>                  kfree_skb(skb);
>>                  return RESPST_ERR_RKEY_VIOLATION;
>> --
>> 2.31.1
>>
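
For anyone reading the archive later, here is a minimal user-space sketch of
the guard that the patch restores. It is not kernel code: struct fake_mr,
fake_put() and release_mr() are hypothetical stand-ins for the rxe MR object,
rxe_put() and the call site in read_reply(), and the struct layout is made up.
In the oops above, CR2 is 0x10 while RDI is zero, which is consistent with the
put touching a field at a small offset from a NULL pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted MR object.
 * This is NOT the kernel's struct rxe_mr layout. */
struct fake_mr {
	int ref_cnt;
};

/* Stand-in for the put helper: it dereferences its argument
 * unconditionally, so the caller must never pass NULL.
 * Returns 1 when the last reference has been dropped, else 0. */
static int fake_put(struct fake_mr *mr)
{
	assert(mr != NULL);	/* precondition that the crashing path violated */
	mr->ref_cnt--;
	return mr->ref_cnt == 0;
}

/* Guarded release, mirroring the "if (mr) rxe_put(mr);" in the patch.
 * A zero-byte read has no MR, so there is nothing to put. */
static int release_mr(struct fake_mr *mr)
{
	if (mr)
		return fake_put(mr);
	return 0;
}
```

Calling release_mr(NULL) is a no-op that returns 0, whereas calling fake_put()
directly with NULL would trip the assertion, which is the user-space analogue
of the oops in the report.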



