Re: Linux kernel v4.15-rc4 and rdma_rxe

On Wed, 2018-01-03 at 07:13 +0200, Moni Shoua wrote:
> > Does this perhaps mean that the rxe_qp structure can be freed while rxe_do_task()
> > is in progress? Please note that the ib_srpt driver only destroys a QP
> > (srpt_destroy_ch_ib() call in srpt_release_channel_work()) after all SCSI command
> > processing has finished (transport_deregister_session()).
> 
> If I understand correctly, you are saying that the system hangs when trying to
> take a lock in rxe_do_task() (line 89). Is that right?
> Anyway, it's possible that you hit a bug related to destroying a QP.

Hello Moni,

The issues I had reported may be unrelated. BTW, this is what I saw appearing
in the system log a few minutes ago:

Jan  3 13:03:56 ubuntu-vm kernel: ib_srpt:srpt_close_ch: ib_srpt 192.168.122.76-18: queued zerolength write
Jan  3 13:03:56 ubuntu-vm kernel: rdma_rxe:rxe_completer: rdma_rxe: rxe_completer(): qp valid 1, state ERROR
[ ... ]
Jan  3 13:04:09 ubuntu-vm kernel: ib_srpt:srpt_disconnect_ch_sync: ib_srpt ch 192.168.122.76-18 state 3
[ ... ]
Jan  3 13:04:14 ubuntu-vm kernel: ib_srpt srpt_disconnect_ch_sync(192.168.122.76-18 state 3): still waiting ...

In other words, the ib_srpt driver had queued a zero-length write and changed
the QP state to ERROR, but no completion was queued for that zero-length write.
The rdma_rxe log message was generated by the following code:

diff --git a/drivers/infiniband/sw/rxe/rxe_comp.c b/drivers/infiniband/sw/rxe/rxe_comp.c
index 6cdc40ed8a9f..f6c40edbddc6 100644
--- a/drivers/infiniband/sw/rxe/rxe_comp.c
+++ b/drivers/infiniband/sw/rxe/rxe_comp.c
@@ -550,6 +550,9 @@ int rxe_completer(void *arg)
 
 	if (!qp->valid || qp->req.state == QP_STATE_ERROR ||
 	    qp->req.state == QP_STATE_RESET) {
+		pr_debug("rxe_completer(): qp valid %d, state %s\n",
+			 qp->valid, qp->req.state == QP_STATE_ERROR ? "ERROR" :
+			 qp->req.state == QP_STATE_RESET ? "RESET" : "(?)");
 		rxe_drain_resp_pkts(qp, qp->valid &&
 				    qp->req.state == QP_STATE_ERROR);
 		goto exit;
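
To make the expectation behind the "still waiting" message concrete, here is a
minimal user-space sketch of the behavior ib_srpt appears to rely on: once a QP
is in the ERROR state, every outstanding work request, including the
zero-length write, should eventually produce a completion with a flush-error
status. This is only a model under that assumption. All names (sketch_qp,
sketch_post_send(), sketch_modify_to_error(), ...) are hypothetical and none of
this is rdma_rxe or ib_srpt code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; not taken from rdma_rxe or ib_srpt. */
enum sketch_qp_state { SKETCH_QP_RTS, SKETCH_QP_ERROR };
enum sketch_wc_status { SKETCH_WC_SUCCESS, SKETCH_WC_FLUSH_ERR };

#define SKETCH_MAX_WR 8

struct sketch_wc {
	unsigned long wr_id;
	enum sketch_wc_status status;
};

struct sketch_qp {
	enum sketch_qp_state state;
	unsigned long pending[SKETCH_MAX_WR];	/* posted, not yet completed */
	size_t n_pending;
	struct sketch_wc cq[SKETCH_MAX_WR];	/* completions seen by the ULP */
	size_t n_cq;
};

/* Complete every pending work request with a flush-error status. */
static void sketch_flush(struct sketch_qp *qp)
{
	for (size_t i = 0; i < qp->n_pending; i++) {
		assert(qp->n_cq < SKETCH_MAX_WR);
		qp->cq[qp->n_cq].wr_id = qp->pending[i];
		qp->cq[qp->n_cq].status = SKETCH_WC_FLUSH_ERR;
		qp->n_cq++;
	}
	qp->n_pending = 0;
}

/* Post a work request; in the ERROR state it is flushed right away. */
static void sketch_post_send(struct sketch_qp *qp, unsigned long wr_id)
{
	assert(qp->n_pending < SKETCH_MAX_WR);
	qp->pending[qp->n_pending++] = wr_id;
	if (qp->state == SKETCH_QP_ERROR)
		sketch_flush(qp);
}

/* Move the QP to ERROR and flush whatever is still outstanding. */
static void sketch_modify_to_error(struct sketch_qp *qp)
{
	qp->state = SKETCH_QP_ERROR;
	sketch_flush(qp);
}
```

In this model, posting a work request and then calling sketch_modify_to_error()
always leaves one flush-error completion per posted request in cq[]. The log
above suggests that the corresponding completion never arrived in my test.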

Bart.



