On Fri, Sep 26, 2014 at 11:48 AM, Christoph Hellwig <hch@xxxxxx> wrote:
> On Fri, Sep 26, 2014 at 10:29:34AM -0400, Trond Myklebust wrote:
>> It worries me that we're putting a mutex directly in the writeback
>> path. For small arrays, it might be acceptable, but what if you have a
>> block device with 1000s of disks on the back end?
>>
>> Is there no better way to fix this issue?
>
> Not without getting rid of the rpc_pipefs interface. That is on my
> very long term TODO list, but it will require new userspace support.

Why is that? rpc_pipefs was designed to be message-based, so it should
work quite well in a multi-threaded environment. We certainly don't use
mutexes around the gssd up/downcall, and the only reason for the mutex
in idmapd is to deal with the keyring upcall.

> Note that I'm actually worried about GETDEVICEINFO from the writeback
> path in general. There is a lot that happens when we don't have
> a device in cache, including the need to open a block device for
> the block layout driver, which is a complex operation full of
> GFP_KERNEL allocation, or even a more complex scsi device scan
> for the object layout. It's been on my more near term todo list
> to look into reproducers for deadlocks in this area which seem
> very possible, and then look into a fix for it; I can't really
> think of anything less drastic than refusing block or object layout
> I/O from memory reclaim if we don't have the device cached yet.
> The situation for file layouts seems less severe, so I'll need
> help from people more familiar with it to think about the situation there.

Agreed,

-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@xxxxxxxxxxxxxxx
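
As a rough illustration of the kind of guard Christoph suggests for the
block/object layouts: the helper and type names below (my_layout_get_device(),
my_layout_lookup_deviceid(), my_layout_probe_deviceid(), struct my_layout,
etc.) are hypothetical and do not correspond to the actual pnfs code, but
checking PF_MEMALLOC on the current task is the usual way to detect that a
caller is running in direct reclaim.

#include <linux/sched.h>	/* current, PF_MEMALLOC */
#include <linux/gfp.h>		/* gfp_t, GFP_KERNEL */

/* Hypothetical types standing in for the layout driver's own. */
struct my_layout;
struct my_deviceid;
struct my_deviceid_node;

static struct my_deviceid_node *
my_layout_lookup_deviceid(struct my_layout *lo, const struct my_deviceid *id);
static struct my_deviceid_node *
my_layout_probe_deviceid(struct my_layout *lo, const struct my_deviceid *id,
			 gfp_t gfp_flags);

static struct my_deviceid_node *
my_layout_get_device(struct my_layout *lo, const struct my_deviceid *id)
{
	struct my_deviceid_node *node;

	/* Cheap cache lookup first; safe from any context. */
	node = my_layout_lookup_deviceid(lo, id);
	if (node)
		return node;

	/*
	 * Device not cached.  Setting it up means GETDEVICEINFO plus
	 * opening a block device or scanning for a SCSI device, all
	 * involving GFP_KERNEL allocations.  If we got here from
	 * direct reclaim, doing that now risks a deadlock, so refuse.
	 */
	if (current->flags & PF_MEMALLOC)
		return NULL;

	/* Slow path: GETDEVICEINFO and full device setup. */
	return my_layout_probe_deviceid(lo, id, GFP_KERNEL);
}

Returning NULL on the reclaim path would let the caller fall back to writing
through the MDS instead of blocking on device setup, which is essentially the
"refuse block or object layout I/O from memory reclaim" behaviour described
above.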