Re: [PATCH] xprtrdma: make sure MRs are unmapped before freeing them

> On Aug 15, 2020, at 1:45 AM, Dan Aloni <dan@xxxxxxxxxxxx> wrote:
> 
> On Fri, Aug 14, 2020 at 04:21:54PM -0400, Chuck Lever wrote:
>> 
>> 
>>> On Aug 14, 2020, at 3:10 PM, Dan Aloni <dan@xxxxxxxxxxxx> wrote:
>>> 
>>> On Fri, Aug 14, 2020 at 02:12:48PM -0400, Chuck Lever wrote:
>>>> Hi Dan-
>>>> 
>>>>> On Aug 14, 2020, at 1:37 PM, Dan Aloni <dan@xxxxxxxxxxxx> wrote:
>>>>> 
>>>>> It was observed that on disconnections, these unmaps don't occur. The
>>>>> relevant path is rpcrdma_mrs_destroy(), being called from
>>>>> rpcrdma_xprt_disconnect().
>>>> 
>>>> MRs are supposed to be unmapped right after they are used, so
>>>> during disconnect they should all be unmapped already. How often
>>>> do you see a DMA mapped MR in this code path? Do you have a
>>>> reproducer I can try?
>>> 
>>> These are not graceful disconnections but abnormal ones: many large
>>> IOs are still in flight when the remote server suddenly breaks the
>>> connection. The remote IP stays reachable, but it refuses new
>>> connections for a few seconds.
>> 
>> Ideally that's not supposed to matter. I'll see if I can reproduce
>> with my usual tricks.
>> 
>> Why is your server behaving this way?
> 
> It's a dedicated storage cluster under a specific testing scenario,
> implementing floating IPs.  Haven't tried, but maybe the same scenario
> can be reproduced with a standard single Linux NFSv3 server by fiddling
> with nfsd open ports.

Hi Dan, I was able to reproduce the DMA-map leak with a simple server-side
disconnect injection test. I'll try some root cause analysis tomorrow.
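
For readers following the thread, the invariant under discussion (an MR
that is still DMA mapped must be unmapped before it is freed) can be
illustrated with a toy user-space model. This is only a sketch under
stated assumptions: it is not the xprtrdma code, it performs no real DMA,
and every name in it (toy_mr, toy_dma_map, toy_dma_unmap, toy_mr_alloc,
toy_mrs_destroy, toy_live_mappings) is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy user-space model; NOT the xprtrdma code. All names are hypothetical. */
struct toy_mr {
    bool mapped;            /* true while a simulated DMA mapping is live */
    struct toy_mr *next;    /* singly linked list of MRs */
};

/* Number of simulated mappings that have not been released yet. */
static int toy_live_mappings;

/* Simulate mapping an MR for DMA; mapping twice is a no-op. */
static void toy_dma_map(struct toy_mr *mr)
{
    if (!mr->mapped) {
        mr->mapped = true;
        toy_live_mappings++;
    }
}

/* Simulate releasing the mapping; the MR must currently be mapped. */
static void toy_dma_unmap(struct toy_mr *mr)
{
    assert(mr->mapped);
    mr->mapped = false;
    toy_live_mappings--;
}

/* Allocate an unmapped MR and push it onto the list; returns the new head
 * (or the old head unchanged if allocation fails). */
static struct toy_mr *toy_mr_alloc(struct toy_mr *head)
{
    struct toy_mr *mr = calloc(1, sizeof(*mr));

    if (!mr)
        return head;
    mr->next = head;
    return mr;
}

/* Destroy every MR on the list. An MR that is still mapped (for example
 * because an I/O was in flight at disconnect) is unmapped before it is
 * freed, so no simulated mapping is leaked. */
static void toy_mrs_destroy(struct toy_mr *head)
{
    while (head) {
        struct toy_mr *next = head->next;

        if (head->mapped)
            toy_dma_unmap(head);
        free(head);
        head = next;
    }
}
```

In this model, skipping the "if (head->mapped)" check in
toy_mrs_destroy() would leave toy_live_mappings above zero after
teardown, which is the toy analogue of the leak described above.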


--
Chuck Lever





