Hi,

Some comments...

On Wed, 2010-04-14 at 22:36 +0200, Arnd Bergmann wrote:
> From: Matthew Wilcox <willy@xxxxxxxxxxxxxxx>
>
> I've taken a patch originally written by Matthew Wilcox and
> ported it to the current version. It seems that there were
> originally concerns that this breaks NFS, but since Trond
> has recently removed the BKL from NFS, my naive assumption
> would be that it's all good now, despite not having tried to
> understand what it does.

[snip]

>  fs/locks.c |  110 ++++++++++++++++++++++++++++++++++++------------------------
>  1 files changed, 66 insertions(+), 44 deletions(-)
>
> diff --git a/fs/locks.c b/fs/locks.c
> index ab24d49..87f1c60 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -140,9 +140,23 @@ int lease_break_time = 45;
>  #define for_each_lock(inode, lockp) \
>  	for (lockp = &inode->i_flock; *lockp != NULL; lockp = &(*lockp)->fl_next)
>
> +/*
> + * Protects the two list heads below, plus the inode->i_flock list
> + */
> +static DEFINE_SPINLOCK(file_lock_lock);
>  static LIST_HEAD(file_lock_list);
>  static LIST_HEAD(blocked_list);
>
> +static inline void lock_flocks(void)
> +{
> +	spin_lock(&file_lock_lock);
> +}
> +
> +static inline void unlock_flocks(void)
> +{
> +	spin_unlock(&file_lock_lock);
> +}
> +
>  static struct kmem_cache *filelock_cache __read_mostly;
>

Why not just put the spin lock calls inline?

[snip]

>  	for_each_lock(inode, before) {
>  		struct file_lock *fl = *before;
>  		if (IS_POSIX(fl))
> @@ -767,8 +779,11 @@ static int flock_lock_file(struct file *filp, struct file_lock *request)
>  	 * If a higher-priority process was blocked on the old file lock,
>  	 * give it the opportunity to lock the file.
>  	 */
> -	if (found)
> +	if (found) {
> +		unlock_flocks();
>  		cond_resched();
> +		lock_flocks();
> +	}
>

Use cond_resched_lock() here perhaps?

[snip]

> @@ -1341,7 +1358,7 @@ int fcntl_getlease(struct file *filp)
>   * The (input) flp->fl_lmops->fl_break function is required
>   * by break_lease().
>   *
> - * Called with kernel lock held.
> + * Called with file_lock_lock held.
>   */
>  int generic_setlease(struct file *filp, long arg, struct file_lock **flp)
>  {
> @@ -1436,7 +1453,15 @@ out:
>  }
>  EXPORT_SYMBOL(generic_setlease);
>
> - /**
> +static int __vfs_setlease(struct file *filp, long arg, struct file_lock **lease)
> +{
> +	if (filp->f_op && filp->f_op->setlease)
> +		return filp->f_op->setlease(filp, arg, lease);
> +	else
> +		return generic_setlease(filp, arg, lease);
> +}
> +
> +/**
>   * vfs_setlease - sets a lease on an open file
>   * @filp: file pointer
>   * @arg: type of lease to obtain
> @@ -1467,12 +1492,9 @@ int vfs_setlease(struct file *filp, long arg, struct file_lock **lease)
>  {
>  	int error;
>
> -	lock_kernel();
> -	if (filp->f_op && filp->f_op->setlease)
> -		error = filp->f_op->setlease(filp, arg, lease);
> -	else
> -		error = generic_setlease(filp, arg, lease);
> -	unlock_kernel();
> +	lock_flocks();
> +	error = __vfs_setlease(filp, arg, lease);
> +	unlock_flocks();
>

This looks to me like generic_setlease(), or whatever fs-specific ->setlease() there might be, will be called under a spin lock. That doesn't look right to me.

Rather than adding locking here, why not push the BKL down into generic_setlease() and ->setlease() first, and then eliminate it on a case-by-case basis later on? That is the pattern that has been followed elsewhere in the kernel.

I might have some further comments on this, but that's as far as I've got at the moment,

Steve.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel"
in the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html