Re: [PATCH 15/15] pnfs: layout roc code

Benny Halevy <bhalevy@xxxxxxxxxxx> · Sun, 26 Dec 2010 10:40:49 +0200

On 2010-12-23 02:19, Fred Isaman wrote:
> 
> On Dec 22, 2010, at 5:00 PM, Trond Myklebust wrote:
> 
>> On Tue, 2010-12-21 at 23:00 -0500, Fred Isaman wrote:
>>> A layout can request return-on-close.  How this interacts with the
>>> forgetful model of never sending LAYOUTRETURNS is a bit ambiguous.
>>> We forget any layouts marked roc, and wait for them to be completely
>>> forgotten before continuing with the close.  In addition, to compensate
>>> for races with any inflight LAYOUTGETs, and the fact that we do not get
>>> any layout stateid back from the server, we set the barrier to the worst
>>> case scenario of current_seqid + number of outstanding LAYOUTGETS.
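As a rough illustration of the worst-case barrier described above, here is a minimal userspace sketch. The struct and function names (layout_model, roc_barrier) are hypothetical and not taken from the patch; only the arithmetic, current seqid plus the number of outstanding LAYOUTGETs, comes from the description. It assumes each in-flight LAYOUTGET can advance the seqid by at most one.

```c
#include <stdint.h>

/* Hypothetical model of the layout state that feeds the barrier. */
struct layout_model {
	uint32_t current_seqid;          /* last layout seqid the client has seen */
	uint32_t outstanding_layoutgets; /* LAYOUTGETs still in flight */
};

/*
 * Worst-case barrier: assume (not confirmed by the patch text) that every
 * in-flight LAYOUTGET may bump the seqid by one, so the barrier is the
 * current seqid plus the number of requests in flight.
 */
static uint32_t roc_barrier(const struct layout_model *lo)
{
	return lo->current_seqid + lo->outstanding_layoutgets;
}
```

With current_seqid = 5 and three LAYOUTGETs in flight, this model sets the barrier to 8.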
>>>
>>> Signed-off-by: Fred Isaman <iisaman@xxxxxxxxxx>
>>> ---
>>> fs/nfs/inode.c         |    1 +
>>> fs/nfs/nfs4_fs.h       |    2 +-
>>> fs/nfs/nfs4proc.c      |   21 +++++++++++-
>>> fs/nfs/nfs4state.c     |    7 +++-
>>> fs/nfs/pnfs.c          |   83 ++++++++++++++++++++++++++++++++++++++++++++++++
>>> fs/nfs/pnfs.h          |   28 ++++++++++++++++
>>> include/linux/nfs_fs.h |    1 +
>>> 7 files changed, 138 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
>>> index 43a69da..c64bb40 100644
>>> diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h
>>> index 29d504d..90515de 100644
>>> --- a/include/linux/nfs_fs.h
>>> +++ b/include/linux/nfs_fs.h
>>> @@ -190,6 +190,7 @@ struct nfs_inode {
>>> 	struct rw_semaphore	rwsem;
>>>
>>> 	/* pNFS layout information */
>>> +	struct rpc_wait_queue lo_rpcwaitq;
>>> 	struct pnfs_layout_hdr *layout;
>>> #endif /* CONFIG_NFS_V4*/
>>> #ifdef CONFIG_NFS_FSCACHE
>>
>> I believe that I've asked this before. Why do we need a per-inode
>> rpc_wait_queue just to support pnfs? That's a significant expansion of
>> an already bloated structure.
>>
>> Can we please either make this a single per-filesystem wait queue, or
>> else possibly a pool of wait queues?
>>
>> Trond
> 
> This was introduced to avoid deadlocks that occurred when we had a single wait queue. However, the deadlocks I remember were due to a combination of two things: at the time, we handled EAGAIN errors from I/O outside the RPC code, and we sent LAYOUTRETURN on such errors. Since we do neither now, I believe a single per-filesystem wait queue will suffice. Anyone disagree?

The deadlocks were also because we didn't use an rpc wait queue but rather a thread-based one.
Doing the serialization in the rpc prepare phase using a shared queue should not cause deadlocks.
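
For the pool-of-wait-queues alternative Trond mentions, a minimal userspace sketch of how an inode might be mapped to a shared queue could look like the following. Everything here is an assumption for illustration: waitq_model stands in for struct rpc_wait_queue, and LO_WAITQ_POOL_SIZE and lo_waitq_for_inode are hypothetical names, not existing kernel symbols.

```c
#include <stdint.h>

/* Hypothetical pool size; a power of two so masking can replace modulo. */
#define LO_WAITQ_POOL_SIZE 16

/* Stand-in for struct rpc_wait_queue in this userspace model. */
struct waitq_model {
	unsigned int nr_waiters;
};

/* One shared pool instead of one queue embedded in every inode. */
static struct waitq_model lo_waitq_pool[LO_WAITQ_POOL_SIZE];

/* Pick the queue for an inode by masking its inode number. */
static struct waitq_model *lo_waitq_for_inode(uint64_t ino)
{
	return &lo_waitq_pool[ino & (LO_WAITQ_POOL_SIZE - 1)];
}
```

This keeps the per-inode cost at zero bytes, at the price of unrelated inodes occasionally sharing a queue.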

Benny

> 
> Fred
> 
>>
>> -- 
>> Trond Myklebust
>> Linux NFS client maintainer
>>
>> NetApp
>> Trond.Myklebust@xxxxxxxxxx
>> www.netapp.com
>>
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html