Re: Linux kernel v4.15-rc4 and rdma_rxe

Bart Van Assche <Bart.VanAssche@xxxxxxx> · Wed, 3 Jan 2018 00:53:21 +0000

On Tue, 2018-01-02 at 19:44 +0200, Moni Shoua wrote:
> This is a great input for the debugger (whoever that be). From a brief
> look at the code I see that the QP error state is checked during the
> validation of an RDMA_WRITE request. In this case a completion is
> generated and the size of the buffer to write remains irrelevant.
> However, to verify that I wasn't wrong you can add some printk() in
> the path that starts with rxe_responder(). When flow reaches
> check_resource() and when QP is in ERROR state the function returns
> RESPST_COMPLETE. The next step in the state machine would be to call
> the do_complete() function.

Hello Moni,

Thanks for the feedback and the suggestion. I will check the ib_srpt code
further for possible race conditions. However, after enabling the dynamic
debug statements in the rdma_rxe driver, and with memory poisoning enabled,
I ran into the following, which I don't think is caused by the ib_srpt
driver:

rdma_rxe:rxe_responder: rdma_rxe: qp#19 state = CLEANUP
rdma_rxe:rxe_responder: rdma_rxe: qp#19 state = DONE
general protection fault: 0000 [#1] PREEMPT SMP
CPU: 1 PID: 1385 Comm: kworker/1:26 Not tainted 4.15.0-rc4-dbg+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Workqueue: target_completion target_complete_ok_work [target_core_mod]
RIP: 0010:__lock_acquire+0xe4/0x13b0
RSP: 0018:ffff944ec40df9b0 EFLAGS: 00010002
RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8acf4b076748
RBP: ffff944ec40dfa80 R08: 0000000000000001 R09: ffffffffc061d1b7
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8acf5e8f2880 R14: 0000000000000001 R15: ffff8acf4b076748
FS:  0000000000000000(0000) GS:ffff8acf7fc8000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fffab80d000 CR3: 000000005fa0f005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
lock_acquire+0xac/0x230
_raw_spin_lock_irqsave+0x45/0x60
rxe_do_task+0x87/0x100 [rdma_rxe]
rxe_run_task+0x16/0x30 [rdma_rxe]
rxe_resp_queue_pkt+0x42/0x50 [rdma_rxe]
rxe_rcv+0x363/0x8b0 [rdma_rxe]
rxe_loopback+0x9/0x10 [rdma_rxe]
rxe_requester+0x6ea/0x1160 [rdma_rxe]
rxe_do_task+0x7c/0x100 [rdma_rxe]
rxe_run_task+0x16/0x30 [rdma_rxe]
rxe_post_send+0x2f0/0x550 [rdma_rxe]
srpt_queue_response+0x20c/0x400 [ib_srpt]
srpt_queue_status+0x28/0x40 [ib_srpt]
target_complete_ok_work+0x1ea/0x520 [target_core_mod]
process_one_work+0x211/0x6a0
worker_thread+0x38/0x3b0
kthread+0x124/0x140

(gdb) list *(rxe_do_task+0x87)
0xc1e7 is in rxe_do_task (drivers/infiniband/sw/rxe/rxe_task.c:90).
85              do {
86                      cont = 0;
87                      ret = task->func(task->arg);
88
89                      spin_lock_irqsave(&task->state_lock, flags);
90                      switch (task->state) {
91                      case TASK_STATE_BUSY:
92                              if (ret)
93                                      task->state = TASK_STATE_START;
94                              else

From the disas rxe_do_task output:
   0x000000000000c1d9 <+121>:   callq  *0x78(%rbx)
   0x000000000000c1dc <+124>:   mov    %r12,%rdi
   0x000000000000c1df <+127>:   mov    %eax,%r14d
   0x000000000000c1e2 <+130>:   callq  0xc1e7 <rxe_do_task+135>
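
A side note on the register dump above: RAX holds 6b6b6b6b6b6b6b6b, and 0x6b
is the POISON_FREE byte from include/linux/poison.h that slab poisoning writes
over freed objects. The userspace sketch below only illustrates how such a
value can be recognized; the helper name looks_like_freed_slab() is
hypothetical and not part of the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* POISON_FREE from include/linux/poison.h: byte written over freed slab objects. */
#define POISON_FREE 0x6b

/* Hypothetical helper: true if every byte of v equals the slab free-poison byte. */
static bool looks_like_freed_slab(uint64_t v)
{
	for (int i = 0; i < 8; i++) {
		if (((v >> (8 * i)) & 0xff) != POISON_FREE)
			return false;
	}
	return true;
}
```

Applied to the dump above, the RAX value matches the pattern while RDI
(ffff8acf4b076748) does not, which is consistent with a pointer having been
loaded from memory that was already freed.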

Does this perhaps mean that the rxe_qp structure can be freed while rxe_do_task()
is in progress? Please note that the ib_srpt driver only destroys a QP
(srpt_destroy_ch_ib() call in srpt_release_channel_work()) after all SCSI command
processing has finished (transport_deregister_session()).
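
To make the question concrete, here is a minimal userspace model of the
suspected ordering, under the assumption that the work function can drop the
last reference to the object that contains the task. All names (model_task,
model_put, drop_last_ref, run_task_touches_freed) are hypothetical, and the
model sets a flag instead of calling free() so that it stays well-defined. It
is not a claim about how rdma_rxe actually manages QP references.

```c
#include <stdbool.h>

struct model_task {
	int  refcount;                      /* stands in for the QP reference count */
	bool freed;                         /* set instead of calling free() */
	int (*func)(struct model_task *t);  /* stands in for task->func */
};

/* Drop one reference; mark the object as freed when the count reaches zero. */
static void model_put(struct model_task *t)
{
	if (--t->refcount == 0)
		t->freed = true;
}

/* Work function that happens to drop the last reference. */
static int drop_last_ref(struct model_task *t)
{
	model_put(t);
	return 0;
}

/*
 * Same order as in the listing: call func, then touch the task again.
 * Returns true if the access after the call would hit a freed object.
 * With hold_ref set, an extra reference is held across the call.
 */
static bool run_task_touches_freed(struct model_task *t, bool hold_ref)
{
	bool uaf;

	if (hold_ref)
		t->refcount++;
	t->func(t);
	uaf = t->freed;  /* stands in for spin_lock_irqsave(&task->state_lock, flags) */
	if (hold_ref)
		model_put(t);
	return uaf;
}
```

In this model the access after func() hits a freed object unless a reference
is held across the call. Whether that is what happens in rxe_do_task() is
exactly the open question.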

Thanks,

Bart.