On Fri, 2008-08-01 at 12:14 +1000, Neil Brown wrote: > It sounds to me like the core problem here is that the locking regime > imposed by the VFS simply isn't suitable for JFFS2 .. and a bunch of > other filesystems. Well, the VFS isn't the part that's recursing into the file system code and causing deadlock. It's only nfsd which is doing that. > This seems to suggest that the VFS should be changed. > Superficially, it seems that changing the locking rules so that the > callee takes i_mutex rather than the caller taking it would help and, > in the case of JFFS2, make f->sem redundant. Does that match your > understanding? Not entirely. I'm not sure I see how we could make such a change -- the VFS relies on i_mutex for a number of operations it performs for itself, so it's hard to see how we could do without using it there. I'm not entirely sure the approach would be sufficient for JFFS2, either. Both jffs2_readdir() and jffs2_lookup() are _still_ going to want to take the lock, whichever lock it is. And recursing back into the file system while the file system is already holding the lock is _still_ going to be considered harmful. I have even less faith that it would end up working for the various other file systems which are affected. Don't let me discourage you from trying, though :) > Clearly we need a short term solution too as we don't want to wait for > VFS locking rules to be renegotiated. The idea of a "lock is owned by > me" check is appealing to me as it is a small, localised change that > would easily be removed if/when the locking we "fixed" properly. > > In the JFFS2 case I imagine this taking the following form: > > - new field in jffs2_inode_info "struct task_struct *sem_owner", > initialised to NULL > - in jffs2_readdir after locking ->sem, set ->sem_owner to current. > - before unlocking ->sem, set ->sem_owner to NULL > - in jffs2_lookup, check if "dir_f->sem_owner == current" > If it does, set a flag reminding us not to drop the lock, > else mutex_lock(&dir_f->sem); You're describing the hack I posted at http://lists.infradead.org/pipermail/linux-mtd/2007-March/017663.html but couldn't bring myself to commit. And it (or some other workaround) would be needed in _every_ affected file system. I'm much happier with putting the workaround in nfsd code, and keeping it out of the individual file systems. -- David Woodhouse Open Source Technology Centre David.Woodhouse@xxxxxxxxx Intel Corporation -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html