Re: [PATCH v1] RDMA/core: Fix check_flush_dependency splat on addr_wq

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> On Aug 29, 2022, at 12:45 PM, Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> 
> On Fri, Aug 26, 2022 at 07:57:04PM +0000, Chuck Lever III wrote:
>> The connect APIs would be a place to start. In the meantime, though...
>> 
>> Two or three years ago I spent some effort to ensure that closing
>> an RDMA connection leaves a client-side RPC/RDMA transport with no
>> RDMA resources associated with it. It releases the CQs, QP, and all
>> the MRs. That makes initial connect and reconnect both behave exactly
>> the same, and guarantees that a reconnect does not get stuck with
>> an old CQ that is no longer working or a QP that is in TIMEWAIT.
>> 
>> However that does mean that substantial resource allocation is
>> done on every reconnect.
> 
> And if the resource allocations fail then what happens? The storage
> ULP retries forever and is effectively deadlocked?

The reconnection attempt fails, and any resources allocated during
that attempt are released. The ULP waits a bit then tries again
until it works or is interrupted.

A deadlock might occur if one of those allocations triggers
additional reclaim activity.


> How much allocation can you safely do under GFP_NOIO?

My naive take is that doing those allocations under NOIO would
help avoid recursion during memory exhaustion.


--
Chuck Lever







[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux