Re: [rfc] fsync_range?

Bryan Henderson wrote:
> >- although that will cause unnecessary I/O barriers, one per
> >fsync_range().
> 
> What do I/O barriers have to do with it?  An I/O barrier says, "don't 
> harden later writes before these have hardened," whereas fsync_range() 
> says, "harden these writes now."  Does Linux these days send an I/O 
> barrier to the block subsystem and/or device as part of fsync()?

For better or worse, I/O barriers and I/O flushes are the same thing
in the Linux block layer.  I've argued for treating them distinctly,
because there are different I/O scheduling opportunities around each
of them, but there wasn't much interest.

> Or are we talking about the command to the device to harden all earlier 
> writes (now) against a device power loss?  Does fsync() do that?

Ultimately that's what we're talking about, yes.  Imho fsync() should
do that, because a userspace database/filesystem should have access to
the same integrity guarantees as an in-kernel filesystem.  Linux
fsync() doesn't always send the command - it was a bit unpredictable
last time I looked.

There are other opinions.  MacOSX fsync() doesn't - because it has an
fcntl(), F_FULLFSYNC, which is a stronger version of fsync() documented
for that case.  They preferred reduced integrity of fsync() to keep
benchmarks on par with other OSes which don't send the command.
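
As a rough sketch of what a portable userspace caller ends up doing,
assuming the MacOSX fcntl() in question is F_FULLFSYNC: the helper name
flush_to_media() is made up for illustration, and on Linux the plain
fsync() branch may or may not reach the device cache, as noted above.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: ask for data to reach stable media,
 * not just the drive's write cache.  Returns 0 or -1. */
static int flush_to_media(int fd)
{
#ifdef F_FULLFSYNC
    /* MacOSX: the stronger, separately documented flush. */
    if (fcntl(fd, F_FULLFSYNC) != -1)
        return 0;
    /* Not every filesystem supports it; fall back to fsync(). */
#endif
    /* Elsewhere: whether this sends a cache flush command to the
     * device depends on the kernel and filesystem. */
    return fsync(fd);
}
```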

Interestingly, Windows _does_ have the option to send the command to
the device, controlled by userspace.  If you set the Windows
equivalents to O_DSYNC and O_DIRECT at the same time, then calls to
the Windows equivalent to fdatasync() cause an I/O barrier command to
be sent to the disk if necessary.  The Windows documentation even
explains the difference between OS caching and device caching and when
each one occurs, too.  Wow - it looks like Windows (later versions)
has had the edge in doing the right thing here for quite some time...

   http://www.microsoft.com/sql/alwayson/storage-requirements.mspx
   http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/sqlIObasics.mspx

> Either way, I can see that multiple fsync_range()s in a row would be a 
> little worse than just one, but it's a pretty bad problem anyway, so I don't 
> know if you could tell the difference.

A little?  It's the difference between letting the disk schedule 100
scattered writes itself, and forcing the disk to write them in the
order you sent them from userspace, aside from doubling the rate
of device commands...
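
To make the contrast concrete, here is a sketch using plain pwrite()
and fdatasync() as stand-ins, since fsync_range() is only a proposal.
struct chunk and both function names are hypothetical, and it assumes
each fdatasync() turns into one flush command to the device, which is
not guaranteed.

```c
#include <sys/types.h>
#include <unistd.h>

struct chunk {              /* hypothetical: one scattered write */
    off_t       off;
    const void *buf;
    size_t      len;
};

/* Flush after every write: n flushes, and each write must harden
 * before the next is issued, so the disk cannot reorder them.
 * Returns the number of flushes issued, or -1 on error. */
static int write_flush_each(int fd, const struct chunk *c, int n)
{
    int flushes = 0;
    for (int i = 0; i < n; i++) {
        /* A short write is treated as an error to keep this brief. */
        if (pwrite(fd, c[i].buf, c[i].len, c[i].off) != (ssize_t)c[i].len)
            return -1;
        if (fdatasync(fd) != 0)
            return -1;
        flushes++;
    }
    return flushes;
}

/* Submit all writes first, then flush once: the block layer and the
 * disk are free to schedule the scattered writes themselves.
 * Returns the number of flushes issued (1), or -1 on error. */
static int write_flush_once(int fd, const struct chunk *c, int n)
{
    for (int i = 0; i < n; i++) {
        if (pwrite(fd, c[i].buf, c[i].len, c[i].off) != (ssize_t)c[i].len)
            return -1;
    }
    return fdatasync(fd) == 0 ? 1 : -1;
}
```

With 100 chunks the first variant issues 100 flushes in submission
order, the second issues one.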

-- Jamie