Race between flush and write during an AIO+DIO+O_SYNC write?

Hi all,

One of our (app) developers noticed that io_submit() takes a very long time to
return if the program initiates a write to a block device that's been opened
with O_SYNC and O_DIRECT.  We traced the slowness to blkdev_aio_write, which
seems to initiate a disk cache flush if __generic_file_aio_write returns a
positive value or -EIOCBQUEUED.  Usually we see -EIOCBQUEUED returned, which
triggers the flush, hence io_submit() stalls for a long time.  That doesn't
really feel like the intended usage pattern for aio.
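
To make the condition concrete, here is a small userspace model of the check
described above.  It is only a sketch, not the kernel source:
submit_path_flushes() is a made-up helper name, and EIOCBQUEUED is defined
locally because that kernel-internal code is not exported to userspace headers.

```c
#include <stdbool.h>

/* Kernel-internal "iocb queued, completion comes later" code.  Defined here
 * for illustration only, since userspace <errno.h> does not provide it. */
#define EIOCBQUEUED 529

/* Hypothetical helper mirroring the condition described in the mail:
 * the submit path issues the sync/flush when the write either completed
 * some bytes (ret > 0) or was merely queued (ret == -EIOCBQUEUED). */
static bool submit_path_flushes(long ret)
{
	return ret > 0 || ret == -EIOCBQUEUED;
}
```

With this model, submit_path_flushes(4096) and submit_path_flushes(-EIOCBQUEUED)
are both true, while a plain error such as -5 (-EIO) is false.  The queued case
is the one that makes io_submit() wait on the flush.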

This -EIOCBQUEUED case seems a little strange -- if an async io has been queued
(but not necessarily completed), why would we immediately issue a cache flush?
This seems like a setup for the flush racing against the write: if the write
reaches the device after the flush, the data can still be sitting in a volatile
cache when completion is reported, which defeats O_SYNC.

Jeff Moyer proposed a patchset last spring[1] that removed the -EIOCBQUEUED
case and deferred the flush issue to each filesystem's end_io handler.  Google
doesn't find any NAKs, but the patches don't seem to have gone anywhere.  Is
there a technical reason why these patches were never merged?

Could one establish an end_io handler in blkdev_direct_IO so that async writes
to an O_SYNC+DIO block device will result in a blkdev_issue_flush before
aio_complete?  That would seem to fix the problem of the write and flush race.
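
The two orderings can be compared with a toy userspace model of a disk with a
volatile write cache.  This is only a sketch under that assumption, not kernel
code; struct toy_disk and all the toy_*/flush_at_* names are invented for the
illustration.

```c
#include <stdbool.h>

/* Toy disk with a volatile write cache (illustration only). */
struct toy_disk {
	bool in_cache;	/* data sits in the volatile cache */
	bool on_media;	/* data has reached stable storage */
};

/* The queued write finally arrives at the device. */
static void toy_write_lands(struct toy_disk *d)
{
	d->in_cache = true;
}

/* A cache flush only makes durable what is already in the cache. */
static void toy_flush(struct toy_disk *d)
{
	if (d->in_cache) {
		d->on_media = true;
		d->in_cache = false;
	}
}

/* Ordering questioned above: the flush is issued at submit time and the
 * queued write arrives afterwards.  Returns whether the data is durable. */
static bool flush_at_submit(void)
{
	struct toy_disk d = { false, false };

	toy_flush(&d);		/* nothing cached yet, nothing made durable */
	toy_write_lands(&d);
	return d.on_media;
}

/* Proposed ordering: flush from a completion (end_io-style) handler, after
 * the write has landed and before completion is reported. */
static bool flush_at_end_io(void)
{
	struct toy_disk d = { false, false };

	toy_write_lands(&d);
	toy_flush(&d);
	return d.on_media;
}
```

In this model flush_at_submit() returns false and flush_at_end_io() returns
true, which is the whole argument for moving the flush behind write completion.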

--D

[1] http://oss.sgi.com/archives/xfs/2012-03/msg00082.html
    "fs: fix up AIO+DIO+O_SYNC to actually do the sync part"