Re: On setting a lease across a cluster

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 04, 2008 at 02:47:18PM -0500, J. Bruce Fields wrote:
> > > Then I started to wonder about the current split of functionality between
> > > fcntl_setlease, vfs_setlease and generic_setlease.  The check for no
> > > other process having this file open should be common to all filesystems.
> > > And it should be honoured by NFS (it shouldn't hand out a delegation if
> > > a local user has the file open), so it needs to be in vfs_setlease().
> 
> Of course, in the case of a (still unfortunately theoretical) cluster
> filesystem implementation, those checks will be insufficient on their
> own.

Oh yes, definitely insufficient, but also necessary.  Right now,
filesystems bypass them entirely.  And the last thing we want is
individual filesystems duplicating these tests ...

> Though the write-lease check especially:
> 
> 	if ((arg == F_WRLCK)
>             && ((atomic_read(&dentry->d_count) > 1)
> 	                    || (atomic_read(&inode->i_count) > 1)))
> 		goto out;
> 
> seems like a terrible hack, and gives some annoying false positives:
> 
> 	http://www.ussg.iu.edu/hypermail/linux/kernel/0711.1/2255.html

It is a terrible hack.  It was sufficient for the time, but it wasn't
supposed to be merged in that state ... I haven't had the courage to
figure out how to solve it properly yet.

> > > Now I'm worrying about the mechanics of calling out to a filesystem to
> > > perform a lock.  Obviously, a network filesystem is going to have to
> > > sleep waiting for a response from the fileserver, so that precludes the
> > > use of a spinlock held over the call to f_op->setlease.  I'm not keen on
> > > the idea of EXPORT_SYMBOL_GPL() a spinlock -- verifying locking is quite
> > > hard enough without worrying what a filesystem might be doing with it.
> 
> I'm confused as to what the locks are actually protecting.
> 
> If they're mainly just protecting the inode's lock list, for example,
> then we should let the filesystem call back to helpers in locks.c for
> any manipulations of that list and do the locking in those helpers.

But we need to ensure the lock list is always consistent for, eg,
/proc/locks.  We don't want locks to temporarily appear on the list when
they never get granted (due to an error elsewhere in the chain).

> > > vfs_setlease()
> > >    if (f_op->setlease())
> > >       res = f_op->setlease()
> > >       if (res)
> > >          return res;
> > >    lock_kernel()
> > >    generic_setlease()
> > >    unlock_kernel()
> 
> Why can't the filesystem call into generic_setlease() on its own?

Because (assuming we're rid of the BKL), fcntl_setlease() needs to
acquire the spinlock and hold it while generic_setlease() runs, so
generic_setlease() can't acquire the lock.

> Christoph Hellwig has suggested removing the second case and just doing
> 
> 	.setlease = generic_setlease
> 
> for all filesystems that should support leases; see the final patch in
> 
> 	git://linux-nfs.org/~bfields/linux.git server-cluster-lease-api

Yeah, but that doesn't work if we repurpose the setlease f_op.

-- 
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours.  We can't possibly take such
a retrograde step."
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux