Re: [bug report] blktests srp/002 hang

On Tue, Oct 17, 2023 at 12:09:31PM -0500, Bob Pearson wrote:

 
> For qp#167 the call to srp_post_send() is followed by the rxe driver
> processing the send operation and generating a work completion which
> is posted to the send cq but there is never a following call to
> __srp_get_rx_iu() so the cqe is not received by srp and failure.

I don't see this function in the kernel. Did you mean __srp_get_tx_iu()?
 
> I don't yet understand the logic of the srp driver to fix this but
> the problem is not in the rxe driver as far as I can tell.

It looks to me like __srp_get_tx_iu() is following the design pattern
where the send queue is only polled when it needs to allocate a new
send buffer - ie the send buffers are pre-allocated and cycle through
the queue.
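
As a rough illustration of that pattern, here is a minimal user-space
sketch. It is a toy model, not the ib_srp code: struct toy_ch, the toy_*
helpers and the pool size of 2 are all made up for the example. The only
thing it tries to show is that the send CQ is drained solely from the
buffer-allocation path.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NUM_TX 2	/* pre-allocated send buffers (arbitrary choice) */

/* Hypothetical stand-in for a channel; not the kernel's struct srp_rdma_ch. */
struct toy_ch {
	int free_tx;	/* buffers ready to carry a new send */
	int in_flight;	/* posted sends the device has not completed yet */
	int unpolled;	/* completions sitting in the send CQ */
};

static void toy_init(struct toy_ch *ch)
{
	ch->free_tx = TOY_NUM_TX;
	ch->in_flight = 0;
	ch->unpolled = 0;
}

/* Device side: finish one posted send and queue its completion. */
static void toy_hw_complete(struct toy_ch *ch)
{
	assert(ch->in_flight > 0);
	ch->in_flight--;
	ch->unpolled++;
}

/* Drain the send CQ; every completion returns one buffer to the pool. */
static void toy_poll_send_cq(struct toy_ch *ch)
{
	ch->free_tx += ch->unpolled;
	ch->unpolled = 0;
}

/*
 * The only place the send CQ is polled: when the caller wants a buffer.
 * Returns false if no buffer is free even after polling.
 */
static bool toy_get_tx_buf(struct toy_ch *ch)
{
	toy_poll_send_cq(ch);
	if (ch->free_tx == 0)
		return false;
	ch->free_tx--;
	return true;
}

/* Post a send if a buffer can be obtained. */
static bool toy_post_send(struct toy_ch *ch)
{
	if (!toy_get_tx_buf(ch))
		return false;
	ch->in_flight++;
	return true;
}
```

In this model, a completion queued by toy_hw_complete() stays in
"unpolled" until the next toy_post_send(). If nothing wants to send,
nothing ever polls it, which would look like the completion never being
consumed.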

So, it is not surprising this isn't being called if it is hung - the
hang is probably something that is preventing it from even wanting to
send, which is probably a receive side issue.

Working back up from that point to isolate which missing resource is
needed to trigger a send may bring some more clarity.

Alternatively if __srp_get_tx_iu() is failing then perhaps you've run
into an issue where it hit something rare and recovery does not work.

E.g. this kind of design pattern carries a subtle assumption that the rx
and send CQs are ordered together. Getting an rx completion before the
matching tx completion can trigger the unusual scenario where the send
side runs out of resources.
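
To make the ordering assumption concrete, here is a small toy replay,
again a sketch with made-up names (toy_event, toy_replay) rather than
anything from the driver. It assumes a single send buffer that is busy
with an outstanding request, and that every receive completion makes the
consumer want to send the next request right away.

```c
#include <assert.h>

enum toy_event { TOY_TX_DONE, TOY_RX_DONE };

/*
 * Replay a sequence of completions and count how many times a send was
 * wanted while no send buffer was free.
 */
static int toy_replay(const enum toy_event *ev, int n)
{
	int free_tx = 0;	/* the one buffer is busy with the first request */
	int starved = 0;
	int i;

	assert(n >= 0);
	for (i = 0; i < n; i++) {
		if (ev[i] == TOY_TX_DONE) {
			free_tx++;	/* send completion returns the buffer */
		} else if (free_tx > 0) {
			free_tx--;	/* response seen, next request goes out */
		} else {
			starved++;	/* response seen, but no buffer to send with */
		}
	}
	return starved;
}
```

With the completions in the expected order (tx, then rx) the replay never
starves. Swapping the two produces one starved send, which is the rare
case the recovery path would have to handle.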

Jason


