nfs locking for cluster filesystems

Apologies if this is late....

Please pull from the 'server-cluster-locking-api' branch at

	git://linux-nfs.org/~bfields/linux.git server-cluster-locking-api

for a series of patches which allow NFS to export the locking
functionality provided by filesystems which define their own ->lock()
method (cluster filesystems being the interesting case, and GFS2 the
first example).  There's also a little miscellaneous locks.c cleanup
along the way.

This has gone through an iteration or two with linux-fsdevel and sat in
-mm for a couple of weeks, and Trond has made a pass through it.

We've tested it by running cthon -l on ext3, gfs2, and nfs exports of
the two, in addition to doing some manual testing to ensure correct
handling of conflicts across multiple servers in a gfs2 cluster.

--b.

---

 fs/fuse/file.c                |    3 +-
 fs/gfs2/locking/dlm/plock.c   |  109 +++++++++++++++--
 fs/gfs2/locking/nolock/main.c |    8 +-
 fs/gfs2/ops_file.c            |   12 +-
 fs/lockd/svc4proc.c           |    6 +-
 fs/lockd/svclock.c            |  275 ++++++++++++++++++++++++++++++++++-------
 fs/lockd/svcproc.c            |    7 +-
 fs/lockd/svcsubs.c            |    2 +-
 fs/locks.c                    |  264 +++++++++++++++++++++++----------------
 fs/nfs/file.c                 |    7 +-
 fs/nfs/nfs4proc.c             |    1 +
 fs/nfsd/nfs4state.c           |   30 +++--
 include/linux/fcntl.h         |    4 +
 include/linux/fs.h            |    9 +-
 include/linux/lockd/lockd.h   |   14 ++-
 15 files changed, 546 insertions(+), 205 deletions(-)

commit 586759f03e2e9031ac5589912a51a909ed53c30a
Author: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
Date:   Tue Nov 14 16:37:25 2006 -0500

    gfs2: nfs lock support for gfs2
    
    Add NFS lock support to GFS2.
    
    Signed-off-by: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
    Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>
    Acked-by: Steven Whitehouse <swhiteho@xxxxxxxxxx>

commit 1a8322b2b02071b0c7ac37a28357b93e6362f13e
Author: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
Date:   Tue Nov 28 16:27:06 2006 -0500

    lockd: add code to handle deferred lock requests
    
    Rewrite nlmsvc_lock() to use the asynchronous interface.
    
    As with testlock, we answer nlm requests in nlmsvc_lock by first looking up
    the block and then using the results we find in the block if B_QUEUED is
    set, and calling vfs_lock_file() otherwise.
    
    If this is a new lock request and we get an -EINPROGRESS return on a
    non-blocking request, then we defer the request.
    
    Also modify nlmsvc_unlock() to call the filesystem method if appropriate.
    
    Signed-off-by: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
    Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>

commit f812048020282fdfa9b72a6cf539c33b6df1fd07
Author: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
Date:   Tue Dec 5 23:48:10 2006 -0500

    lockd: always preallocate block in nlmsvc_lock()
    
    Normally we could skip ever having to allocate a block in the case where
    the client asks for a non-blocking lock, or asks for a blocking lock that
    succeeds immediately.
    
    However we're going to want to always look up a block first in order to
    check whether we're revisiting a deferred lock call, and to be prepared to
    handle the case where the filesystem returns -EINPROGRESS--in that case we
    want to make sure the lock we've given the filesystem is the one embedded
    in the block that we'll use to track the deferred request.
    
    Signed-off-by: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
    Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>

commit 5ea0d75037b93baa453b4d326c6319968fe91cea
Author: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
Date:   Tue Nov 28 16:27:06 2006 -0500

    lockd: handle test_lock deferrals
    
    Rewrite nlmsvc_testlock() to use the new asynchronous interface: instead of
    immediately doing a posix_test_lock(), we first look for a matching block.
    If the subsequent test_lock returns anything other than -EINPROGRESS, we
    then remove the block we've found and return the results.
    
    If it returns -EINPROGRESS, then we defer the lock request.
    
    In the case where the block we find in the first step has B_QUEUED set,
    we bypass the vfs_test_lock entirely, instead using the block to decide how
    to respond:
    	with nlm_lck_denied if B_TIMED_OUT is set,
    	with nlm_granted if B_GOT_CALLBACK is set, or
    	by dropping the request if neither B_TIMED_OUT nor B_GOT_CALLBACK is set.
    
    Signed-off-by: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
    Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>
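
As a reading aid, the B_QUEUED decision described in the commit above can be sketched as a small userspace function. This is only an illustration, not the lockd code: the flag names come from the commit message, but the bit values, the sketch_reply codes, and the choice to test B_TIMED_OUT before B_GOT_CALLBACK (following the order of the list) are assumptions.

```c
/* Flag names come from the commit message; the bit values are assumed. */
enum sketch_block_flags {
	B_QUEUED       = 1 << 0,	/* request has been deferred */
	B_GOT_CALLBACK = 1 << 1,	/* filesystem has reported a result */
	B_TIMED_OUT    = 1 << 2,	/* deferred request timed out */
};

/* Hypothetical outcome codes standing in for lockd's real replies. */
enum sketch_reply {
	SKETCH_CALL_FS,		/* block not queued: ask the filesystem */
	SKETCH_DENIED,		/* reply nlm_lck_denied */
	SKETCH_GRANTED,		/* reply nlm_granted */
	SKETCH_DROP,		/* send no reply for now */
};

/* Decide how to answer a test-lock request from the block's flags alone. */
static enum sketch_reply sketch_testlock_reply(unsigned int flags)
{
	if (!(flags & B_QUEUED))
		return SKETCH_CALL_FS;
	if (flags & B_TIMED_OUT)
		return SKETCH_DENIED;
	if (flags & B_GOT_CALLBACK)
		return SKETCH_GRANTED;
	return SKETCH_DROP;
}
```

A block with only B_QUEUED set yields SKETCH_DROP, which corresponds to the "neither flag set" case in the list.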

commit 85f3f1b3f7a6197b51a2ab98d927517df730214c
Author: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
Date:   Tue Nov 28 16:27:06 2006 -0500

    lockd: pass cookie in nlmsvc_testlock
    
    Change NLM internal interface to pass more information for test lock; we
    need this to make sure the cookie information is pushed down to the place
    where we do request deferral, which is handled for testlock by the
    following patch.
    
    Signed-off-by: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
    Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>

commit 0e4ac9d93515b27fd7635332d73eae3192ed5d4e
Author: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
Date:   Tue Nov 28 16:26:51 2006 -0500

    lockd: handle fl_grant callbacks
    
    Add code to handle the filesystem callback when the lock is finally granted.
    
    Signed-off-by: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
    Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>

commit 2b36f412ab6f2e5b64af9832b20eb7ef67d025b4
Author: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
Date:   Tue Nov 28 16:26:47 2006 -0500

    lockd: save lock state on deferral
    
    We need to keep some state for a pending asynchronous lock request, so this
    patch adds that state to struct nlm_block.
    
    This also adds a function which defers the request, by calling
    rqstp->rq_chandle.defer and storing the resulting deferred request in a
    nlm_block structure which we insert into lockd's global block list.  That
    new function isn't called yet, so it's dead code until a later patch.
    
    Signed-off-by: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
    Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>

commit 2beb6614f5e36c6165b704c167d82ef3e4ceaa0c
Author: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
Date:   Tue Dec 5 23:31:28 2006 -0500

    locks: add fl_grant callback for asynchronous lock return
    
    Acquiring a lock on a cluster filesystem may require communication with
    remote hosts, and to avoid blocking lockd or nfsd threads during such
    communication, we allow the results to be returned asynchronously.
    
    When a ->lock() call needs to block, the file system will return
    -EINPROGRESS, and then later return the results with a call to the
    routine in the fl_grant field of the lock_manager_operations struct.
    
    This differs from the case when ->lock returns -EAGAIN to a blocking
    lock request; in that case, the filesystem calls fl_notify when the lock
    is granted, and the caller retries the original lock.  So while
    fl_notify is merely a hint to the caller that it should retry, fl_grant
    actually communicates the final result of the lock operation (with the
    lock already acquired in the successful case).
    
    Therefore fl_grant takes a lock, a status and, for the test lock case, a
    conflicting lock.  We also allow fl_grant to return an error to the
    filesystem, to handle the case where the fl_grant request arrives after
    the lock manager has already given up waiting for it.
    
    Signed-off-by: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
    Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>
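
The -EINPROGRESS/fl_grant handshake described in the commit above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel interface: every sketch_* type and function is hypothetical, the "filesystem" always defers, and -ENOENT is an arbitrary choice for the error fl_grant returns when the lock manager has already given up.

```c
#include <errno.h>
#include <stddef.h>

struct sketch_lock;

/* Hypothetical stand-in for the lock manager's callback table. */
struct sketch_lm_ops {
	int (*fl_grant)(struct sketch_lock *fl, struct sketch_lock *conflict,
			int result);
};

struct sketch_lock {
	const struct sketch_lm_ops *lmops;
	int manager_waiting;	/* cleared once the manager gives up */
	int result;		/* final status delivered by fl_grant */
};

/* The one request our pretend filesystem is still working on. */
static struct sketch_lock *sketch_pending;

/* ->lock() stand-in: needs remote communication, so it defers. */
static int sketch_fs_lock(struct sketch_lock *fl)
{
	sketch_pending = fl;
	return -EINPROGRESS;
}

/* Later, the filesystem reports the final result through fl_grant. */
static int sketch_fs_complete(int result)
{
	struct sketch_lock *fl = sketch_pending;

	if (fl == NULL)
		return -ENOENT;
	sketch_pending = NULL;
	return fl->lmops->fl_grant(fl, NULL, result);
}

/* Manager side: accept the result, or report that nobody is waiting. */
static int sketch_manager_grant(struct sketch_lock *fl,
				struct sketch_lock *conflict, int result)
{
	(void)conflict;		/* only meaningful for the test-lock case */
	if (!fl->manager_waiting)
		return -ENOENT;	/* assumed error code for "gave up" */
	fl->manager_waiting = 0;
	fl->result = result;
	return 0;
}

/* Run one request; returns what fl_grant reported to the filesystem. */
static int sketch_demo(int manager_gives_up)
{
	static const struct sketch_lm_ops ops = {
		.fl_grant = sketch_manager_grant,
	};
	struct sketch_lock fl = {
		.lmops = &ops, .manager_waiting = 1, .result = -1,
	};

	if (sketch_fs_lock(&fl) != -EINPROGRESS)
		return -EINVAL;
	if (manager_gives_up)
		fl.manager_waiting = 0;
	return sketch_fs_complete(0);
}
```

Unlike an fl_notify-style hint, the callback here carries the final status, so the caller never retries the lock. The error return gives the filesystem a chance to undo a grant that arrived too late.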

commit fd85b8170dabbf021987875ef7f903791f4f181e
Author: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
Date:   Tue Nov 28 16:26:41 2006 -0500

    nfsd4: Convert NFSv4 to new lock interface
    
    Convert NFSv4 to the new lock interface.  We don't define any callback for now,
    so we're not taking advantage of the asynchronous feature--that's less critical
    for the multi-threaded nfsd than it is for the single-threaded lockd.  But this
    does allow cluster filesystems to export cluster-coherent locking to NFS.
    
    Note that it's cluster filesystems that are the issue--of the filesystems that
    define lock methods (nfs, cifs, etc.), most are not exportable by nfsd.
    
    Signed-off-by: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
    Signed-off-by: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>

commit 9b9d2ab4154a42ea4a119f7d3e4e0288bfe0bb79
Author: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
Date:   Thu Jan 18 17:52:58 2007 -0500

    locks: add lock cancel command
    
    Lock managers need to be able to cancel pending lock requests.  In the case
    where the exported filesystem manages its own locks, it's not sufficient just
    to call posix_unblock_lock(); we need to let the filesystem know what's
    happening too.
    
    We do this by adding a new fcntl lock command: F_CANCELLK.  Some day this
    might also be made available to userspace applications that could benefit from
    an asynchronous locking api.
    
    Signed-off-by: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
    Signed-off-by: "J. Bruce Fields" <bfields@xxxxxxxxxxxxxx>
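
A rough userspace model of the cancel path described in the commit above: clear the local waiter, then tell the filesystem if it manages its own locks. All sketch_* names and the SKETCH_CANCELLK value are hypothetical, and clearing local_waiter merely stands in for posix_unblock_lock().

```c
#include <stddef.h>

#define SKETCH_CANCELLK 1	/* placeholder for the new cancel command */

struct sketch_file;

struct sketch_file_ops {
	int (*lock)(struct sketch_file *filp, int cmd);
};

struct sketch_file {
	const struct sketch_file_ops *f_op;
	int local_waiter;	/* 1 while a request waits locally */
	int fs_cancelled;	/* set when the filesystem sees the cancel */
};

/* Pretend cluster filesystem: records that it was told to cancel. */
static int sketch_cluster_lock(struct sketch_file *filp, int cmd)
{
	if (cmd == SKETCH_CANCELLK)
		filp->fs_cancelled = 1;
	return 0;
}

/* Cancel a pending request locally, then inform the filesystem. */
static int sketch_cancel(struct sketch_file *filp)
{
	filp->local_waiter = 0;	/* stands in for posix_unblock_lock() */
	if (filp->f_op && filp->f_op->lock)
		return filp->f_op->lock(filp, SKETCH_CANCELLK);
	return 0;
}

/* Returns 1 if the cancel reached a filesystem ->lock() method. */
static int sketch_cancel_reaches_fs(int fs_defines_lock)
{
	static const struct sketch_file_ops cluster_ops = {
		.lock = sketch_cluster_lock,
	};
	struct sketch_file filp = {
		.f_op = fs_defines_lock ? &cluster_ops : NULL,
		.local_waiter = 1,
		.fs_cancelled = 0,
	};

	sketch_cancel(&filp);
	return filp.fs_cancelled;
}
```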

commit 150b393456e5a23513cace286a019e87151e47f0
Author: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
Date:   Thu Jan 18 16:15:35 2007 -0500

    locks: allow {vfs,posix}_lock_file to return conflicting lock
    
    The nfsv4 protocol's lock operation, in the case of a conflict, returns
    information about the conflicting lock.
    
    It's unclear how clients can use this, so for now we're not going so far as to
    add a filesystem method that can return a conflicting lock, but we may as well
    return something in the local case when it's easy to.
    
    Signed-off-by: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
    Signed-off-by: "J. Bruce Fields" <bfields@xxxxxxxxxxxxxx>

commit 7723ec9777d9832849b76475b1a21a2872a40d20
Author: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
Date:   Thu Jan 18 15:08:55 2007 -0500

    locks: factor out generic/filesystem switch from setlock code
    
    Factor out the code that switches between generic and filesystem-specific lock
    methods; eventually we want to call this from lock managers (lockd and nfsd)
    too; currently they only call the generic methods.
    
    This patch does that for all the setlk code.
    
    Signed-off-by: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
    Signed-off-by: "J. Bruce Fields" <bfields@xxxxxxxxxxxxxx>
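
The generic/filesystem switch that the commit above factors out amounts to a dispatch like the one below. This is a self-contained sketch rather than the locks.c code: the sketch_* types are hypothetical stand-ins for struct file and its operations, and the return values exist only to show which path was taken.

```c
#include <stddef.h>

struct sketch_file;

struct sketch_file_ops {
	/* Optional filesystem-specific lock method. */
	int (*lock)(struct sketch_file *filp, int cmd);
};

struct sketch_file {
	const struct sketch_file_ops *f_op;
};

enum { SKETCH_USED_GENERIC = 1, SKETCH_USED_FS = 2 };

/* Stand-in for the generic POSIX lock code. */
static int sketch_generic_lock(struct sketch_file *filp, int cmd)
{
	(void)filp;
	(void)cmd;
	return SKETCH_USED_GENERIC;
}

/* Stand-in for a cluster filesystem's own ->lock() method. */
static int sketch_cluster_lock(struct sketch_file *filp, int cmd)
{
	(void)filp;
	(void)cmd;
	return SKETCH_USED_FS;
}

/* The switch: prefer the filesystem's method, else fall back. */
static int sketch_vfs_lock_file(struct sketch_file *filp, int cmd)
{
	if (filp->f_op && filp->f_op->lock)
		return filp->f_op->lock(filp, cmd);
	return sketch_generic_lock(filp, cmd);
}

/* Reports which path a file with or without ->lock() ends up on. */
static int sketch_which_path(int fs_defines_lock)
{
	static const struct sketch_file_ops cluster_ops = {
		.lock = sketch_cluster_lock,
	};
	static const struct sketch_file_ops plain_ops = {
		.lock = NULL,
	};
	struct sketch_file filp = {
		.f_op = fs_defines_lock ? &cluster_ops : &plain_ops,
	};

	return sketch_vfs_lock_file(&filp, 0);
}
```

Keeping the switch in a single helper lets lock managers such as lockd and nfsd reach a filesystem's ->lock() without duplicating the check.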

commit 3ee17abd14c728d4e0ca7a991c58f2250cb091af
Author: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>
Date:   Wed Feb 21 00:58:50 2007 -0500

    locks: factor out generic/filesystem switch from test_lock
    
    Factor out the code that switches between generic and filesystem-specific lock
    methods; eventually we want to call this from lock managers (lockd and nfsd)
    too; currently they only call the generic methods.
    
    This patch does that for test_lock.
    
    Note that this hasn't been necessary until recently, because the few
    filesystems that define ->lock() (nfs, cifs...) aren't exportable via NFS.
    However GFS (and, in the future, other cluster filesystems) need to implement
    their own locking to get cluster-coherent locking, and also want to be able to
    export locking to NFS (lockd and NFSv4).
    
    So we accomplish this by factoring out code such as this and exporting it for
    the use of lockd and nfsd.
    
    Signed-off-by: "J. Bruce Fields" <bfields@xxxxxxxxxxxxxx>

commit 9d6a8c5c213e34c475e72b245a8eb709258e968c
Author: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
Date:   Wed Feb 21 00:55:18 2007 -0500

    locks: give posix_test_lock same interface as ->lock
    
    posix_test_lock() and ->lock() do the same job but have gratuitously
    different interfaces.  Modify posix_test_lock() so the two agree,
    simplifying some code in the process.
    
    Signed-off-by: Marc Eshel <eshel@xxxxxxxxxxxxxxx>
    Signed-off-by: "J. Bruce Fields" <bfields@xxxxxxxxxxxxxx>

commit 70cc6487a4e08b8698c0e2ec935fb48d10490162
Author: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>
Date:   Thu Feb 22 18:48:53 2007 -0500

    locks: make ->lock release private data before returning in GETLK case
    
    The file_lock argument to ->lock is used to return the conflicting lock
    when found.  There's no reason for the filesystem to return any private
    information with this conflicting lock, but the nfsv4 client currently does.
    
    Fix nfsv4 client, and modify locks.c to stop calling fl_release_private
    for it in this case.
    
    Signed-off-by: "J. Bruce Fields" <bfields@xxxxxxxxxxxxxx>
    Cc: "Trond Myklebust" <Trond.Myklebust@xxxxxxxxxx>

commit c2fa1b8a6c059dd08a802545fed3badc8df2adc1
Author: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>
Date:   Tue Feb 20 16:10:11 2007 -0500

    locks: create posix-to-flock helper functions
    
    Factor out a bit of messy code by creating posix-to-flock counterparts
    to the existing flock-to-posix helper functions.
    
    Cc: Christoph Hellwig <hch@xxxxxxxxxxxxx>
    Signed-off-by: "J. Bruce Fields" <bfields@xxxxxxxxxxxxxx>

commit 226a998dbf3c6f9b85f67d08a52c5a2143ed9d88
Author: J. Bruce Fields <bfields@xxxxxxxxxxxxxx>
Date:   Wed Feb 14 14:25:00 2007 -0500

    locks: trivial removal of unnecessary parentheses
    
    Remove some unnecessary parentheses.
    
    Signed-off-by: "J. Bruce Fields" <bfields@xxxxxxxxxxxxxx>