Re: O_DIRECT and barriers

On Sat, Aug 22, 2009 at 01:56:13AM +0100, Jamie Lokier wrote:
> AIX behaves like XFS according to documentation:
> 
>     [ http://publib.boulder.ibm.com/infocenter/systems/index.jsp?topic=/com.ibm.aix.genprogc/doc/genprogc/fileio.htm ]
> 
>     Direct I/O and Data I/O Integrity Completion
> 
>     Although direct I/O writes are done synchronously, they do not
>     provide synchronized I/O data integrity completion, as defined by
>     POSIX. Applications that need this feature should use O_DSYNC in
>     addition to O_DIRECT. O_DSYNC guarantees that all of the data and
>     enough of the metadata (for example, indirect blocks) have written
>     to the stable store to be able to retrieve the data after a system
>     crash. O_DIRECT only writes the data; it does not write the
>     metadata.
> 
> That's another reason to use O_DIRECT|O_DSYNC in moderately portable
> code.

...or call fsync() at the point where they need a guarantee that the
data has been durably written, and not before.  This becomes critically
important if the application is writing into a sparse file, or writing
into uninitialized blocks that were allocated using fallocate(); with
O_DIRECT|O_DSYNC, the file system would have to do a commit operation
after each such write, which could be a performance disaster.
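
As a rough illustration of the two approaches, here is a minimal sketch.
The helper name write_blocks(), the 4096-byte block size, and the file
paths are all made up for the example; real alignment requirements depend
on the file system and device, and some file systems reject O_DIRECT
outright with EINVAL.

```c
#define _GNU_SOURCE           /* exposes O_DIRECT in glibc's <fcntl.h> */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0            /* no O_DIRECT here: falls back to buffered I/O */
#endif

#define BLOCK_SIZE 4096       /* assumed alignment; real requirements vary */

/* Hypothetical helper: write nblocks aligned blocks to path using O_DIRECT.
 * extra_flags: 0, or O_DSYNC to request data integrity on every write.
 * sync_at_end: nonzero to issue a single fsync() after the last write.
 * Returns 0 on success, -1 on error with errno set. */
int write_blocks(const char *path, int extra_flags, int nblocks, int sync_at_end)
{
    assert(nblocks > 0);

    void *buf = NULL;
    int err = posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE);
    if (err != 0) {
        errno = err;
        return -1;
    }
    memset(buf, 'x', BLOCK_SIZE);

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT | extra_flags,
                  0644);
    if (fd < 0) {
        int saved = errno;
        free(buf);
        errno = saved;
        return -1;
    }

    int rc = 0;
    for (int i = 0; i < nblocks && rc == 0; i++) {
        ssize_t n = write(fd, buf, BLOCK_SIZE);
        if (n != BLOCK_SIZE) {
            if (n >= 0)
                errno = EIO;  /* treat a short write as an error */
            rc = -1;
        }
    }

    /* One fsync() for the whole batch instead of a commit per write. */
    if (rc == 0 && sync_at_end && fsync(fd) != 0)
        rc = -1;

    int saved = errno;
    if (close(fd) != 0 && rc == 0) {
        saved = errno;
        rc = -1;
    }
    free(buf);
    errno = saved;
    return rc;
}
```

Calling write_blocks(path, 0, n, 1) does n direct writes and then a
single fsync(), while write_blocks(path, O_DSYNC, n, 0) asks for
integrity completion on every write.  How far apart the two are in
practice depends on the file system and on whether the writes allocate
or convert blocks.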

> > http://ext4.wiki.kernel.org/index.php/Clarifying_Direct_IO's_Semantics
> > 
> > Comments are welcome, either on the wiki's talk page, or directly to
> > me, or to the linux-fsdevel or linux-ext4 lists.
> 
> I haven't read it yet.  One thing which comes to mind is it would be
> good to summarise what other OSes as well as Linux do with O_DIRECT
> w.r.t. data-finding metadata, preallocation, file extending, hole
> filling, unaligned access and what alignment is required, block
> devices vs. files and different filesystems and behaviour-modifying
> mount options, file open for buffered I/O on another descriptor, file
> has mapped pages, mlocked pages, and of course drive cache write
> through or not.

It's a wiki; contributions to define all of that are welcome.  :-)

We may want to carefully consider what we want to guarantee for all
time to application writers, and what we might want to leave open to
allow for performance optimizations by the kernel, though.

                                                - Ted