On Wed, 2008-09-10 at 09:37 -0400, Christoph Hellwig wrote:
> On Wed, Sep 10, 2008 at 09:31:08AM -0400, Chris Mason wrote:
> > Hello everyone,
> >
> > Is there a VFS level reason why we hold the directory mutex while we
> > fsync the directory?  I'm seeing a pretty dramatic improvement in
> > directory fsync heavy workloads when I drop it during transaction
> > commit, and it seems like this should be a safe optimization in all
> > the journaled filesystems.
> >
> > This applies to file fsync as well, the mutex doesn't protect us from
> > all the possible places that might come in and make new dirty data.
>
> No, there's no reason for it.  The whole ->fsync interface is a
> complete mess.  I've planned to do a few changes for a while:
>

Here's an example of the difference that dropping it during commit makes
on btrfs.  The workload is akpm's synctest, running 40 procs total, 10
procs per dir doing something similar to a mailserver that does
fsync(dir).  Single sata drive with barriers on:

http://oss.oracle.com/~mason/seekwatcher/locking.png

(ext4 comes in at about 230 seconds as well)

>  - drop the dentry argument, it can be derived from file->f_path.dentry,
>    except for the case of nfsd syncing directories.  For that last case
>    we should be able to fake up a file struct either using dentry_open
>    or in the worst case on that stack.
>  - move the filemap_fdatawrite and filemap_fdatawait into ->fsync.
>    For any filesystems that want to provide data vs metadata ordering
>    we need to do the filemap_fdatawrite before updating i_size.
>    See fs/xfs/xfs_vnodeops.c:xfs_fsync() for what I mean.

Btrfs would like this too, I have to filemap_fdatawait before the new
i_size shows up in the metadata.

-chris