This series aims to serialise unaligned direct IOs to an inode to avoid corruption caused by sub-block zeroing races. The previous approaches at the direct IO layer fail because for !DIO_LOCKING filesystems like XFS, there is no way we can track and serialise all the direct IOs to a given inode in a race-free manner. While we can track them, we cannot close the races between mapping blocks and tracked IO completion occurring before subsequent tracking lookups without adding some kind of locking to the DIO layer. Hence for !DIO_LOCKING users, unaligned direct IO needs to be serialised at a higher layer.

Because the xfs_file_aio_write() path is so twisted and difficult to follow, it is difficult to verify that new locking cases added to the code are correct in all cases. Hence the series starts by cleaning up the code and splitting apart the direct IO and buffered IO paths before adding the unaligned direct IO detection and serialisation.

The first patch fixes a sync write error handling bug - we should consider pushing that to .38. The next patches factor code that is common to write and splice into helpers. The direct and buffered IO paths are then separated out, and the common write checks and bounds limiting are factored out into a helper.

Finally, the serialisation of unaligned direct IOs is added by a big-hammer approach. That is, we take the i_mutex and XFS_IOLOCK_EXCL and hold them across the unaligned IO submission. This means that unaligned direct IO submission is serialised, and non-AIO DIO is serialised completely.

For unaligned AIO DIO, this would only serialise the submission of the DIO, leaving the sub-block zeroing races open for unaligned writes into unwritten extents. To avoid this problem, we use xfs_ioend_wait() to ensure all AIO writes have completed before we submit the unaligned write. We do this wait while holding the i_mutex so that we serialise against other unaligned AIO, as there is no need to serialise against aligned DIO.
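For illustration only, the unaligned direct IO detection mentioned above could look roughly like the following user-space sketch. It is not code from this series: is_unaligned_dio() and its block_size parameter are hypothetical names, and the block size is assumed to be a non-zero power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * User-space sketch only; not the kernel code from this series.
 * block_size stands in for the filesystem block size and is assumed
 * to be a non-zero power of two.
 */
static bool is_unaligned_dio(uint64_t pos, uint64_t count, uint64_t block_size)
{
	uint64_t mask;

	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
	mask = block_size - 1;

	/* Unaligned if either the start or the end falls inside a block. */
	return (pos & mask) != 0 || ((pos + count) & mask) != 0;
}
```

With a 4096 byte block size, a 512 byte write at offset 512 would be flagged as unaligned and so would take the serialised path, while a 4096 byte write at offset 0 would not.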
Version 2:
- fix initial sync write error return fixup
- add new patch to abstract locking from read/write path and remove the
  need for the need_i_mutex variable.