On Fri, Dec 11, 2009 at 08:53:37AM +0000, Florian Weimer wrote:
> * Dave Chinner:
>
> > On Thu, Dec 10, 2009 at 09:22:35AM +0000, Florian Weimer wrote:
> >> I've got an odd performance issue. It seems that when fsync() is
> >> called on a file, other processes block when they try to access it.
> >> This is not merely due to I/O contention on the underlying block
> >> device, it seems.
> >
> > The inode mutex is held across the ->fsync() method. If that takes a
> > long time to run, then other processes will block trying to take the
> > inode mutex. i.e. part of fsync serialises access to the inode.
>
> Is an inode lock required to read from the file?

Not usually - normally only for data writes and metadata
modifications. However, some filesystems dirty objects even on read
(e.g. changing atime) and so can serialise on other filesystem locks
(e.g. ext3 journal lock) that are being held by the fsync.

> >> Oracle reported a similar performance issue in the Berkeley DB JE
> >> changelog. Is this really true? Are there any workarounds? (I'm
> >> mainly interested in the situation on ext[34] and XFS.)
> >
> > For XFS, the ->fsync method blocks for as long as it takes to write
> > a synchronous transaction (1 IO). ext4 looks like it writes the
> > inode rather than doing a journal commit, so it should only need a
> > single IO with the inode mutex held, too. I don't think these can be
> > optimised any further.
>
> I'm not concerned with fsync latency per se. It's going to take a
> while to write a few GBs scattered across the file. However, it's
> annoying that read operations on the same file (which can't even see
> the effect of the fsync operation) are blocked, some times for more
> than two minutes.

If they are blocking for that long then sysrq-w during that period
will tell us exactly where, and in which filesystem, they are
blocking....

Cheers,

Dave.
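
To put numbers on the read stalls described above, something like the
following could be run on the filesystem in question. It is only a
minimal sketch: the function name, file name, sizes and intervals are
all made up for illustration, and reads served from the page cache may
not show the stall at all. One thread dirties blocks scattered across
a file and calls fsync(), while a second thread times reads of the
same file through its own descriptor.

```python
import os
import tempfile
import threading
import time


def measure_reads_during_fsync(path, file_size=8 << 20, block=4096,
                               dirty_blocks=256, read_interval=0.001):
    """Dirty scattered blocks of `path`, fsync it, and time concurrent reads.

    Hypothetical helper, not taken from any existing tool.
    Returns (fsync_seconds, list_of_read_latencies_in_seconds).
    """
    # Create the file at full size so every read offset is in range.
    with open(path, "wb") as f:
        f.truncate(file_size)

    latencies = []
    stop = threading.Event()

    def reader():
        # Separate descriptor, as an unrelated reader would have.
        fd = os.open(path, os.O_RDONLY)
        try:
            offset = 0
            while True:
                start = time.monotonic()
                os.pread(fd, block, offset)
                latencies.append(time.monotonic() - start)
                offset = (offset + block) % file_size
                if stop.is_set():
                    break
                time.sleep(read_interval)
        finally:
            os.close(fd)

    thread = threading.Thread(target=reader)
    thread.start()

    fd = os.open(path, os.O_WRONLY)
    try:
        # Spread the dirty blocks evenly across the file.
        stride = file_size // dirty_blocks
        for i in range(dirty_blocks):
            os.pwrite(fd, b"x" * block, i * stride)
        start = time.monotonic()
        os.fsync(fd)
        fsync_seconds = time.monotonic() - start
    finally:
        os.close(fd)
        stop.set()
        thread.join()

    return fsync_seconds, latencies


if __name__ == "__main__":
    # Assumption: replace the temporary directory with a directory on
    # the ext3/ext4/XFS filesystem under test.
    with tempfile.TemporaryDirectory() as d:
        fsync_s, lat = measure_reads_during_fsync(os.path.join(d, "probe.dat"))
        print(f"fsync took {fsync_s:.3f}s; slowest of {len(lat)} reads "
              f"took {max(lat):.3f}s")
```

If the slowest read comes out close to the fsync time, that is the
moment to capture the blocked-task dump. For those without a console
keyboard, writing "w" to /proc/sysrq-trigger as root has the same
effect as sysrq-w, and the stack traces end up in the kernel log.
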
-- 
Dave Chinner
david@xxxxxxxxxxxxx