Re: [PATCH] VFS: Unlink should revoke all outstanding leases on file

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




J. Bruce Fields :
> On Thu, May 20, 2010 at 05:46:06PM +0800, Mi Jinlong wrote:
>> J. Bruce Fields :
>>> I don't know of any existing lock that does exactly what we want.
>>>
>>> Somebody at citi worked on a better lease implementation for a while,
>>> but I don't think we ever really got it right; the last version I can
>>> find is here:
>>>
>>> 	git://linux-nfs.org/~bfields/linux-topics.git leases
>>   When reading the code of the git, i found a patch which try to fix
>>   the lease's problem, but only for unlink.
> 
> It's 8 patches together:
> 
>>   commit id: d5a08e556116c66fb60a448f805a40bf54314634
>>         msg: "leases: break file leases on unlink."
>>
>>   In this patch, it seems only add break_lease() and some other functions
>>   but seems don't avoid the problem of race.
> 
> Look again: break_lease() is there, but there's also a break_lease_end()
> after the unlink.

  Thanks. That's important.

> 
>> Or there is some different 
>>   at break_lease() with the community's kernel.
>>
>>   Can you give me some message about the new lease? Thanks.
> 
> So the 8 patches at that branch are:
> 	leases: introduce per-inode lease enabling/disabling
> 	VFS: clean up extra dereferences in do_unlinkat()
> 	leases: break file leases on unlink.
> 	leases: break file leases on rename.
> 	leases: break leases on chown.
> 	VFS: refactor sys_fchmod and sys_fchmodat
> 	leases: break leases on chmod.
> 	leases: break leases on link.
> 
> Like I say, I don't think they're correct or I'd just copy them all to
> the list.  But maybe the comment on the first patch (appended) is
> useful.

  As reading the first patch ("leases: introduce ... "), it's really useful.
  But, as Jamie Lokier said "new lease semantics should use new userspace API.",
  is there some new lease under development ?

  And, IMO, this problem we discussing is serious that should be fix ASAP.
  Can we fix this problem refer to this solutions? 

thanks, 
Mi Jinlong

> 
> --b.
> 
> leases: introduce per-inode lease enabling/disabling
> 
> The current file lease implementation is inadequate (for the purposes of
> nfs, and, we believe, for the purposes of Samba), in at least two ways:
> 
> 	- Leases are broken only conflicting opens; but both nfsv4
> 	  delegations and (we're told) Windows op locks actually require
> 	  that leases be broken on any operation that changes file
> 	  metadata--including unlink, link, rename, chmod, and chown.
> 
> 	- The internal kernel api used for lease-breaking is inherently
> 	  racy, consisting as it does of a single break_lease() call.
> 	  (Consider this scenario: a file is not currently open and is
> 	  about to be unlinked.  During unlink processing, a lookup is
> 	  done, and break_lease() is called.  After the break_lease(),
> 	  but before the unlink completes, another user opens the file
> 	  and gets a read lease.  The unlink then completes, but the
> 	  other user thinks their read lease is still valid.  This
> 	  situation would be avoided if lease-granting for the inode
> 	  were disabled for the duration of the unlink.)
> 
> We're primarily interested in the case of read leases for now.  (Write
> leases, which also must be broken on *access* to a file, are more
> difficult to get completely right, and aren't used by the current nfs
> server.)
> 
> Fixing the second problem requires replacing break_lease() by a pair of
> calls, here called break_lease() and break_lease_end(), between which
> new leases are temporarily prohibited.
> 
> We want to implement that temporary prohibition in a simple way that has
> low impact in common (uncontended) cases.
> 
> This patch adds a field, i_leasecount, which provides mutual exclusion
> between inode-modifying operations and read leases in the same way the
> i_writecount provides mutual exclusion between write opens and execs:
> when i_leasecount is positive, it counts the number of leases on the
> given inode, and when it's negative it counts the number of operations
> which want leases temporarily disabled.  This allows selective
> enabling/disabling of leases on a per-inode basis.
> 
> To that end, the functions leases_get_access() and leases_put_access()
> are used when a lease is granted and returned, respectively.  The
> functions leases_deny_access() and leases_allow_access() are used to
> prevent races between breaking-with-FMODE_WRITE and write-lease-granting
> for the entire duration of a file operation.  Currently, leases are
> broken only when a file is opened or truncated; these functions will
> allow leases to be broken on things like unlink and rename as well.
> NFSv4 implements delegations using leases, and needs its delegations to
> be revoked on unlinks, renames, chowns, etc.
> 
> Note that this patch changes break_lease() and __break_lease(), such
> that when they are called with FMODE_WRITE and return successfully, they
> will leave leases disabled on the inode in question, and the caller must
> eventually call break_lease_end() to re-enable leasing.  As alluded to
> in the scenario above, this behavior isn't necessary when breaking
> without FMODE_WRITE: existing and new read-leases wouldn't need to be
> revoked or blocked; and a write-lease-granting setlease won't race the
> break_lease() because the latter is presumed to have been preceded by
> something like a dget() on the dentry in question (where d_count or
> i_count > 1 blocks write-lease-granting).
> 
> This patch also closes a small existing open/lease race: a lease-related
> race exists between the time that outstanding leases are broken (by
> may_open()) and the time that, e.g., O_RDWR or O_WRONLY are reflected in
> the inode's i_writecount variable (which will prevent subsequent
> lease-granting setlease calls).  Conceivably, a read lease could be
> granted in the interim.
> 
> To deal with this, may_open() is modified so that, on success and when
> called with FMODE_WRITE, it will return with lease-granting disabled for
> the inode in question.  do_filp_open() is modified so that leasing is
> re-enabled once everything is finished.  Analogous changes are made on
> truncation.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux