Re: [bug report] blktests srp/002 hang

On Tue, Oct 17, 2023 at 12:09:31PM -0500, Bob Pearson wrote:

 
> For qp#167 the call to srp_post_send() is followed by the rxe driver
> processing the send operation and generating a work completion which
> is posted to the send cq but there is never a following call to
> __srp_get_rx_iu() so the cqe is not received by srp and failure.

? I don't see this function in the kernel. Did you mean __srp_get_tx_iu()?
 
> I don't yet understand the logic of the srp driver to fix this but
> the problem is not in the rxe driver as far as I can tell.

It looks to me like __srp_get_tx_iu() is following the design pattern
where the send CQ is only polled when it needs to allocate a new
send buffer - i.e. the send buffers are pre-allocated and cycle through
the queue.
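
A minimal userspace sketch of that pattern, not the actual ib_srp code:
the names (struct channel, get_tx_iu, poll_send_cq, hw_complete_send) and
the ring size are hypothetical, and the "CQ" is just an array. It only
shows that send completions sit unreaped until the next attempt to
allocate a send buffer.

```c
#include <assert.h>
#include <stddef.h>

#define TX_RING_SIZE 4          /* hypothetical number of send buffers */

struct tx_iu {
	int id;
	int in_flight;          /* 1 while posted and not yet reaped */
};

struct channel {
	struct tx_iu ring[TX_RING_SIZE];        /* pre-allocated buffers */
	struct tx_iu *free_list[TX_RING_SIZE];
	size_t n_free;
	struct tx_iu *send_cq[TX_RING_SIZE];    /* completed, not yet reaped */
	size_t n_cqe;
	unsigned int polls;                     /* how often the CQ was polled */
};

static void channel_init(struct channel *ch)
{
	ch->n_free = 0;
	ch->n_cqe = 0;
	ch->polls = 0;
	for (size_t i = 0; i < TX_RING_SIZE; i++) {
		ch->ring[i].id = (int)i;
		ch->ring[i].in_flight = 0;
		ch->free_list[ch->n_free++] = &ch->ring[i];
	}
}

/* Stand-in for the device: queue a send completion for a posted buffer. */
static void hw_complete_send(struct channel *ch, struct tx_iu *iu)
{
	assert(iu->in_flight);
	assert(ch->n_cqe < TX_RING_SIZE);
	ch->send_cq[ch->n_cqe++] = iu;
}

/* Reap all pending send completions and recycle their buffers. */
static void poll_send_cq(struct channel *ch)
{
	ch->polls++;
	while (ch->n_cqe > 0) {
		struct tx_iu *iu = ch->send_cq[--ch->n_cqe];

		iu->in_flight = 0;
		ch->free_list[ch->n_free++] = iu;
	}
}

/*
 * The only place the send CQ is polled: when a new send buffer is wanted.
 * Returns NULL if every buffer is still in flight.
 */
static struct tx_iu *get_tx_iu(struct channel *ch)
{
	struct tx_iu *iu;

	poll_send_cq(ch);
	if (ch->n_free == 0)
		return NULL;
	iu = ch->free_list[--ch->n_free];
	iu->in_flight = 1;
	return iu;
}
```

In this model a completion queued by hw_complete_send() is not observed
until the next get_tx_iu() call, so a channel that never tries to send
again never reaps its send completions.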

So, it is not surprising this isn't being called if it is hung - the
hang is probably something that is preventing it from even wanting to
send, which is probably a receive side issue.

Following back up from that point to isolate which missing
resource is needed to trigger a send may bring some more clarity.

Alternatively if __srp_get_tx_iu() is failing then perhaps you've run
into an issue where it hit something rare and recovery does not work.

e.g. this kind of design pattern carries a subtle assumption that the rx
and send CQs are ordered relative to each other. Getting an rx completion
before the matching tx completion can trigger the unusual scenario where
the send side runs out of resources.
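
A toy model of that ordering hazard, under stated assumptions: the struct
and function names below are hypothetical, there is a single send buffer,
and each rx completion makes the upper layer want to send again
immediately. It is a sketch of the scenario, not of the srp driver.

```c
#include <stdbool.h>

struct chan_model {
	int free_tx;            /* send buffers available right now */
	int pending_tx_cqe;     /* sends finished whose CQE is not yet delivered */
};

/* Try to take a send buffer; fails when none are free. */
static bool try_send(struct chan_model *ch)
{
	if (ch->free_tx == 0)
		return false;
	ch->free_tx--;
	ch->pending_tx_cqe++;
	return true;
}

/* The send completion arrives: the buffer becomes reusable. */
static void deliver_tx_cqe(struct chan_model *ch)
{
	if (ch->pending_tx_cqe > 0) {
		ch->pending_tx_cqe--;
		ch->free_tx++;
	}
}

/* An rx completion arrives and the upper layer wants to send again. */
static bool on_rx_cqe(struct chan_model *ch)
{
	return try_send(ch);
}
```

If on_rx_cqe() runs before deliver_tx_cqe() for the previous send, the
attempt fails even though the earlier send has already finished; whether
the caller recovers from that failure is the part worth checking.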

Jason


