On 10/7/15 9:18 AM, Gleb Natapov wrote:
> Hello XFS developers,
>
> We are working on scylladb[1] database which is written using seastar[2]
> - highly asynchronous C++ framework. The code uses aio heavily: no
> synchronous operation is allowed at all by the framework otherwise
> performance drops drastically. We noticed that the only mainstream FS
> in Linux that takes aio seriously is XFS. So let me start by thanking
> you guys for the great work! But unfortunately we also noticed that
> sometimes io_submit() is executed synchronously even on XFS.
>
> Looking at the code I see two cases when this is happening: unaligned
> IO and write past EOF. It looks like we hit both. For the first one we
> make special effort to never issue unaligned IO and we use XFS_IOC_DIOINFO
> to figure out what alignment should be, but it does not help. Looking at the
> code though xfs_file_dio_aio_write() checks alignment against m_blockmask which
> is set to be sbp->sb_blocksize - 1, so aio expects buffer to be aligned to
> filesystem block size not values that DIOINFO returns. Is it intentional? How
> should our code know what it should align buffers to?

        /* "unaligned" here means not aligned to a filesystem block */
        if ((pos & mp->m_blockmask) || ((pos + count) & mp->m_blockmask))
                unaligned_io = 1;

It should be aligned to the filesystem block size.

> Second one is harder. We do need to write past the end of a file, actually
> most of our writes are like that, so it would have been great for XFS to
> handle this case asynchronously.

You didn't say what kernel you're on, but these:

9862f62 xfs: allow appending aio writes
7b7a866 direct-io: Implement generic deferred AIO completions

hit kernel v3.15.

However, we had a bug report about this, and Brian has sent a fix
which has not yet been merged, see:

[PATCH 1/2] xfs: always drain dio before extending aio write submission

on this list last week.

With those 3 patches, things should just work for you I think.
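
On the alignment question, here is a rough userspace sketch of the same test the kernel snippet above performs, in case it helps to check requests before calling io_submit(). The names io_is_unaligned() and fs_block_size() are made up for illustration and are not part of any XFS or libaio API. It also assumes that statvfs() reports the filesystem block size in f_bsize, which I believe holds for XFS but is worth verifying on your setup.

```c
#include <stdint.h>
#include <sys/statvfs.h>

/*
 * Mirrors the kernel check quoted above: returns 1 if either the start
 * offset or the end offset of the I/O is not a multiple of blocksize,
 * 0 otherwise. blocksize must be a power of two.
 */
int io_is_unaligned(uint64_t pos, uint64_t count, uint64_t blocksize)
{
        uint64_t blockmask = blocksize - 1;

        return ((pos & blockmask) || ((pos + count) & blockmask)) ? 1 : 0;
}

/*
 * Hypothetical helper: look up the block size of the filesystem that
 * holds 'path'. Assumes f_bsize is the filesystem block size.
 * Returns 0 if statvfs() fails.
 */
uint64_t fs_block_size(const char *path)
{
        struct statvfs sv;

        if (statvfs(path, &sv) != 0)
                return 0;
        return (uint64_t)sv.f_bsize;
}
```

With a 4096-byte block size, a 4096-byte write at offset 0 passes the check, while the same write at offset 512 (or a 512-byte write at offset 4096) would be flagged as unaligned.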

-Eric

> Currently we are working to work around
> this by issuing truncate() (or fallocate()) on another thread and doing
> aio on a main thread only after truncate() is complete. It seems to be
> working, but is it guaranteed that a thread issuing aio will never sleep
> in this case (maybe new file size value needs to hit the disk and it is
> not guaranteed that it will happen after truncate() returns, but before
> aio call)?
>
> [1] http://www.scylladb.com/
> [2] http://www.seastar-project.org/
>
> Thanks,
>
> --
> Gleb.
>
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs