On Wed, Aug 06, 2008 at 09:06:48AM +1000, Neil Brown wrote:
> On Tuesday August 5, david@xxxxxxxxxxxxx wrote:
> > On Mon, Aug 04, 2008 at 02:19:12AM -0400, Chuck Lever wrote:
> > > So, the JFFS2 locking problem is a garbage-collection issue. I'm not
> > > sure this is the case with other file systems like XFS and OCFS2. My
> > > impression was that XFS had a transaction logging deadlock,
> >
> > Just to clarify - XFS has a directory buffer lock deadlock. That is,
> > while reading the contents of the directory buffer it is locked to
> > prevent modifications from occurring while extracting the contents.
> > Looking up an entry in the directory also requires the directory
> > buffer lock (for the same reason), so calling the lookup while
> > already holding the directory buffer lock (i.e from the filldir
> > callback) will deadlock.
>
> How much cost would it be, do you think, to drop the lock across the
> call to filldir? Then reclaim the lock, validate pointers etc against
> a 'version' counter, and restart based on the current telldir cookie
> if needed?

The problem is that the dabuf is a temporary structure only valid
for the length of a block read or transaction - it is built from
buffers that are cached and provide persistence. Remember, XFS
supports large, non-contiguous directory blocks, and so the
directory code is extremely complex in places.

To do the above we need to pretty much tear down the dabuf to
unlock everything before the filldir call, then build it in the
lookup during the filldir call, then tear it down for the readdir
to build it again, validate, etc....

Basically, what you suggest above still needs the same
infrastructure as a proper shared locking scheme on the dabuf to
work efficiently. Using a shared locking scheme gives much more
benefit, because it will allow parallel directory traversals and
lookups in *all cases*, not just NFS.
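
To make the locking argument above concrete, here is a minimal
single-threaded sketch in C. It is not XFS code: toy_lock, toy_dir,
toy_readdir() and toy_lookup() are hypothetical names, and the toy
lock reports failure where a real lock would block, so that the
self-deadlock shows up as an error rather than a hang. With the
exclusive scheme, a lookup issued from the filldir callback cannot
take the lock; with the shared scheme, readdir and lookup hold it
at the same time.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NENTRIES       3
#define TOY_WOULD_DEADLOCK (-2)

/* Toy lock state. Acquiring returns false where a real lock would
 * block, so a self-deadlock is reported instead of hanging. */
struct toy_lock {
    int  readers;  /* current shared holders */
    bool writer;   /* true while held exclusively */
};

/* Hypothetical stand-in for a directory buffer; not the XFS dabuf. */
struct toy_dir {
    struct toy_lock lock;
    bool use_shared;                 /* false: exclusive-only scheme */
    const char *names[TOY_NENTRIES];
    int inums[TOY_NENTRIES];
};

typedef int (*toy_filldir_t)(struct toy_dir *dir, const char *name, void *ctx);

/* Take the directory lock in the mode selected by use_shared. */
bool toy_dir_lock(struct toy_dir *dir)
{
    struct toy_lock *l = &dir->lock;

    if (l->writer || (!dir->use_shared && l->readers > 0))
        return false;                /* a real lock would block here */
    if (dir->use_shared)
        l->readers++;
    else
        l->writer = true;
    return true;
}

void toy_dir_unlock(struct toy_dir *dir)
{
    struct toy_lock *l = &dir->lock;

    if (dir->use_shared) {
        assert(l->readers > 0);
        l->readers--;
    } else {
        assert(l->writer);
        l->writer = false;
    }
}

/* Lookup needs the directory lock, for the same reason readdir does. */
int toy_lookup(struct toy_dir *dir, const char *name)
{
    int inum = -1;

    if (!toy_dir_lock(dir))
        return TOY_WOULD_DEADLOCK;
    for (int i = 0; i < TOY_NENTRIES; i++) {
        if (strcmp(dir->names[i], name) == 0) {
            inum = dir->inums[i];
            break;
        }
    }
    toy_dir_unlock(dir);
    return inum;
}

/* Readdir holds the lock across every filldir callback. */
int toy_readdir(struct toy_dir *dir, toy_filldir_t fill, void *ctx)
{
    int emitted = 0;

    if (!toy_dir_lock(dir))
        return TOY_WOULD_DEADLOCK;
    for (int i = 0; i < TOY_NENTRIES; i++) {
        if (fill(dir, dir->names[i], ctx) != 0)
            break;
        emitted++;
    }
    toy_dir_unlock(dir);
    return emitted;
}

/* NFS-style callback: looks up each entry while readdir is running. */
int toy_fill_with_lookup(struct toy_dir *dir, const char *name, void *ctx)
{
    int *sum = ctx;
    int inum = toy_lookup(dir, name);

    if (inum < 0)
        return -1;
    *sum += inum;
    return 0;
}
```

With use_shared set to false, toy_readdir() emits no entries because
the first callback's lookup cannot get the lock; with use_shared set
to true, every entry is looked up. The model ignores writers queued
between the two shared acquisitions, which a real implementation
would have to handle.
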
Basically, I don't want to replace an _easily validated_ hack with
some other nasty, non-trivial, disaster-waiting-to-happen hack that
doesn't provide any benefit over the current hack....

> To me, that is the generic solution to allowing filldir to call
> ->lookup. I'm just not sure what it costs to be constantly dropping
> and reclaiming the lock in the normal case where ->lookup isn't being
> called.

Allowing filldir to call lookup requires shared read lock semantics
between readdir and lookup. I don't think any filesystem has that
implemented; it can't be implemented with i_mutex involved, and it
will be non-trivial to implement in the filesystems that need it.

Normally the generic solution is the lowest common denominator
solution - move the double buffering into the NFSD so everything
works with the current exclusive locking semantics, and then provide
another filldir+lookup for filesystems that are able to do something
special to avoid the slower generic path.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
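
As a rough illustration of the double-buffering approach described
above, here is a minimal userspace sketch in C. It is not NFSD or
filesystem code: toy_dir, toy_dirent, toy_readdir_buffered() and
toy_readdir_plus() are hypothetical names, and a pthread mutex
stands in for the exclusive directory lock. Entries are copied into
a caller-supplied buffer while the lock is held, and the lookups
happen only after it has been dropped, so lookup can take the same
exclusive lock without deadlocking.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define TOY_NENTRIES 3
#define TOY_NAME_LEN 16

/* Hypothetical directory guarded by an exclusive, non-recursive lock. */
struct toy_dir {
    pthread_mutex_t lock;
    const char *names[TOY_NENTRIES];
    int inums[TOY_NENTRIES];
};

/* One buffered entry: the name from phase 1, the inum from phase 2. */
struct toy_dirent {
    char name[TOY_NAME_LEN];
    int inum;
};

/* Fill the directory with three made-up entries: a=11, b=12, c=13. */
int toy_dir_init(struct toy_dir *dir)
{
    static const char *const names[TOY_NENTRIES] = { "a", "b", "c" };

    for (int i = 0; i < TOY_NENTRIES; i++) {
        dir->names[i] = names[i];
        dir->inums[i] = 11 + i;
    }
    return pthread_mutex_init(&dir->lock, NULL);
}

/* Lookup takes the exclusive lock itself. */
int toy_lookup(struct toy_dir *dir, const char *name)
{
    int inum = -1;

    pthread_mutex_lock(&dir->lock);
    for (int i = 0; i < TOY_NENTRIES; i++) {
        if (strcmp(dir->names[i], name) == 0) {
            inum = dir->inums[i];
            break;
        }
    }
    pthread_mutex_unlock(&dir->lock);
    return inum;
}

/* Phase 1: copy names out under the lock; no lookups in here. */
int toy_readdir_buffered(struct toy_dir *dir, struct toy_dirent *out, int max)
{
    int n = 0;

    pthread_mutex_lock(&dir->lock);
    while (n < max && n < TOY_NENTRIES) {
        strncpy(out[n].name, dir->names[n], TOY_NAME_LEN - 1);
        out[n].name[TOY_NAME_LEN - 1] = '\0';
        out[n].inum = -1;
        n++;
    }
    pthread_mutex_unlock(&dir->lock);
    return n;
}

/* Phase 2: the lock is dropped, so each lookup can take it safely.
 * An entry may have gone away in between; its inum then stays -1. */
int toy_readdir_plus(struct toy_dir *dir, struct toy_dirent *out, int max)
{
    int n;

    assert(dir != NULL && out != NULL);
    n = toy_readdir_buffered(dir, out, max);
    for (int i = 0; i < n; i++)
        out[i].inum = toy_lookup(dir, out[i].name);
    return n;
}
```

The cost is the extra copy and the second pass, which is the slower
generic path; the benefit is that nothing in the sketch needs more
than the existing exclusive locking.
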