On Thu, Oct 08, 2015 at 08:21:58AM +0300, Avi Kivity wrote:
> >>>I fixed something similar in ext4 at the time, FWIW.
> >>Makes sense.
> >>
> >>Is there a way to relax this for reads?
> >The above mostly only applies to writes. Reads don't modify data so
> >racing unaligned reads against other reads won't give unexpected
> >results and so aren't serialised.
> >
> >i.e. serialisation will only occur when:
> >    - unaligned write IO will serialise until sub-block zeroing
> >      is complete.
> >    - write IO extending EOF will serialise until post-EOF
> >      zeroing is complete
>
>
> By "complete" here, do you mean that a call to truncate() returned, or that
> its results reached the disk an unknown time later?
>
I think Brian already answered that one with:

  There are no such pitfalls as far as I'm aware. The entire AIO
  submission synchronization sequence triggers off an in-memory i_size
  check in xfs_file_aio_write_checks(). The in-memory i_size is updated in
  the truncate path (xfs_setattr_size()) via truncate_setsize(), so at
  that point the new size should be visible to subsequent AIO writers.

> I could, immediately after truncating the file, extend it to a very large
> size, and truncate it back just before the final fsync/close sequence. This
> has downsides from the viewpoint of user support (why is the file so large
> after a crash, what happens with backups) but is better than nothing.
>
> >    - cached pages are found on the inode (i.e. mixing
> >      buffered/mmap access with direct IO).
>
> We don't do that.
>
> >    - truncate/extent manipulation syscall is run
>
> Actually, we do call fallocate() ahead of io_submit() (in a worker thread,
> in non-overlapping ranges) to optimize file layout and also in the belief
> that it would reduce the amount of blocking io_submit() does.
>
> Should we serialize the fallocate() calls vs. io_submit() (on the same
> file)? Were those fallocates a good idea in the first place?
>
> >All other DIO will be issued and run concurrently, reads and writes.
> >
> >Realistically, if you care about performance (which obviously
> >you do) then you do not do unaligned IO, and you try hard to
> >minimise operations that extend the file...
>
> On SSDs, if you care about performance you avoid random writes, which cause
> write amplification. So you do have to extend the file, unless you know its
> size in advance, which we don't.
>
> Also, does "extend the file" here mean just the size, or extent allocation
> as well?
>
> A final point is discoverability. There is no way to discover safe
> alignment for reads and writes, and which operations block io_submit(),
> except by asking here, which cannot be done at runtime. Interfaces that
> provide a way to query these attributes are very important to us.
As Brian pointed out, statfs() can be used to get f_bsize, which is defined
as "optimal transfer block size".

--
Gleb.
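
For illustration, the statfs() query mentioned above could look roughly like
this in C on Linux. This is only a sketch: the helper names fs_block_size(),
is_block_aligned() and round_up_to_block() are made up for the example, and
treating f_bsize as the alignment that avoids sub-block zeroing is an
assumption drawn from this thread, not something statfs() itself promises.

```c
#include <assert.h>
#include <sys/vfs.h>   /* statfs(), struct statfs (Linux) */

/* Hypothetical helper: f_bsize of the filesystem holding `path`,
 * or -1 if statfs() fails. */
long fs_block_size(const char *path)
{
    struct statfs sfs;

    if (statfs(path, &sfs) != 0)
        return -1;
    return (long)sfs.f_bsize;
}

/* Hypothetical helper: nonzero if both the offset and the length of an
 * IO are multiples of bsize. ASSUMPTION: bsize came from f_bsize. */
int is_block_aligned(long long off, long long len, long bsize)
{
    assert(bsize > 0);
    return off % bsize == 0 && len % bsize == 0;
}

/* Hypothetical helper: round `len` up to the next multiple of bsize,
 * e.g. to pad a write so that it ends on a block boundary. */
long long round_up_to_block(long long len, long bsize)
{
    assert(bsize > 0);
    return (len + bsize - 1) / bsize * bsize;
}
```

An application would call fs_block_size() once per file system at startup and
check each offset/length pair with is_block_aligned() before io_submit().
Note that this only covers the alignment half of the question; it says
nothing about which operations block io_submit().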