Re: [rfc] fsync_range?

Jamie Lokier <jamie@xxxxxxxxxxxxx> wrote on 01/21/2009 01:08:55 PM:

> For better or worse, I/O barriers and I/O flushes are the same thing
> in the Linux block layer.  I've argued for treating them distinctly,
> because there are different I/O scheduling opportunities around each
> of them, but there wasn't much interest.

It's hard to see how they could be combined.  Flushing (waiting for the 
queue of writes to drain) is what you do, at great performance cost, when 
you don't have barriers available; the point of a barrier is to avoid 
ever letting the queue run dry.
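
To make the distinction concrete, here's a minimal userspace sketch (the 
file name and record contents are invented for illustration).  With no 
barrier primitive exposed to applications, the only portable way to 
guarantee write A is stable before write B is even issued is a full flush 
in between -- exactly the queue-draining cost described above:

/* ordering.c -- enforcing write ordering with a flush, for lack of
 * a barrier.  fsync() stalls until the queue drains; a true barrier
 * would merely forbid reordering across this point, without ever
 * letting the device go idle. */
#define _XOPEN_SOURCE 500
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        int fd = open("journal.dat", O_WRONLY | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        const char rec_a[] = "A: data block";
        const char rec_b[] = "B: record that points at A";

        if (pwrite(fd, rec_a, sizeof rec_a, 0) < 0) {
                perror("pwrite A");
                return 1;
        }

        /* Drain the queue so A is stable before B is issued.
         * This is the "great performance cost" above. */
        if (fsync(fd) < 0) { perror("fsync"); return 1; }

        if (pwrite(fd, rec_b, sizeof rec_b, 4096) < 0) {
                perror("pwrite B");
                return 1;
        }
        if (fsync(fd) < 0) { perror("fsync"); return 1; }

        close(fd);
        return 0;
}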

But I don't suppose it matters for this discussion.

> > Or are we talking about the command to the device to harden all
> > earlier writes (now) against a device power loss?  Does fsync() do
> > that?
> 
> Ultimately that's what we're talking about, yes.  Imho fsync() should
> do that, because a userspace database/filesystem should have access to
> the same integrity guarantees as an in-kernel filesystem.  Linux
> fsync() doesn't always send the command - it's a bit unpredictable
> last time I looked.

Yes, it's the old performance vs. integrity issue.  Drives long ago grew 
features that defeat the operating system's integrity efforts in exchange 
for performance: write caching enabled by default, explicit demands to 
write through ignored, and so on.  Obviously some people want that, but I 
_have_ seen Linux developers escalate the battle for control of the disk 
drive.  I can just never remember where it stands at any given moment.
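
For concreteness, this is the sort of commit path a userspace database is 
reduced to -- commit_record() and the file name below are made up.  
Whether the final fdatasync() turns into a device cache-flush command has 
historically depended on the filesystem and its configuration, which is 
the unpredictability Jamie is describing:

/* commit.c -- sketch of a userspace database commit.  The integrity
 * it actually gets depends on whether the kernel's f(data)sync path
 * sends the drive a cache-flush command, which is filesystem- and
 * configuration-dependent. */
#define _XOPEN_SOURCE 500
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: append one log record and make it durable. */
static int commit_record(int log_fd, const void *rec, size_t len)
{
        if (write(log_fd, rec, len) != (ssize_t)len)
                return -1;

        /* fdatasync() commits the data and whatever metadata is
         * needed to retrieve it (e.g. the new file size), skipping
         * incidentals like mtime.  Whether it also flushes the
         * drive's volatile write cache is up to the kernel. */
        if (fdatasync(log_fd) < 0)
                return -1;
        return 0;
}

int main(void)
{
        int fd = open("db.log", O_WRONLY | O_APPEND | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        const char rec[] = "COMMIT txn 42\n";
        if (commit_record(fd, rec, sizeof rec - 1) < 0) {
                perror("commit_record");
                return 1;
        }
        close(fd);
        return 0;
}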

But it doesn't matter in this discussion, because my point is that once 
you accept the performance hit for integrity, 100 fsync_ranges in quick 
succession will cost about the same as 1.  (I suppose we're saying that 
in current Linux, in some configurations, if a process does frequent 
fsyncs of a file, every process writing to every drive that file touches 
slows to write-through speed.)
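
To put that in terms of what current Linux actually offers: there is no 
fsync_range(), but sync_file_range(2) is the nearest primitive, so here 
is a minimal sketch of 100 range syncs followed by one real flush (file 
name and layout invented).  Note sync_file_range commits no metadata and 
issues no device cache flush, so by itself it is weaker than a real 
fsync_range would be:

/* ranges.c -- "100 fsync_ranges in quick succession", approximated
 * with Linux's sync_file_range(2).  If the dirty pages are already
 * under writeback, each call mostly just waits, so the total cost
 * is close to that of a single flush. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        int fd = open("data.db", O_RDWR);   /* hypothetical file */
        if (fd < 0) { perror("open"); return 1; }

        for (int i = 0; i < 100; i++) {
                off_t off = (off_t)i * 1024 * 1024;  /* scattered */
                /* The man page's write-for-data-integrity pattern
                 * for one range: start writeback and wait for it. */
                if (sync_file_range(fd, off, 4096,
                                    SYNC_FILE_RANGE_WAIT_BEFORE |
                                    SYNC_FILE_RANGE_WRITE |
                                    SYNC_FILE_RANGE_WAIT_AFTER) < 0)
                        perror("sync_file_range");
        }

        /* Only a real fsync() buys the metadata commit and (where
         * the filesystem cooperates) the device cache flush. */
        if (fsync(fd) < 0) { perror("fsync"); return 1; }

        close(fd);
        return 0;
}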

> A little?  It's the difference between letting the disk schedule 100
> scattered writes itself, and forcing the disk to write them in the
> order you sent them from userspace, aside from the doubling the rate
> of device commands...

Again, in the scenario I'm talking about, all the writes were already in 
the Linux I/O queue before the first fsync_range() (thanks to fadvises), 
so this doesn't happen.
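
Roughly like this, as a sketch (file name and layout invented; note that 
POSIX_FADV_DONTNEED kicking off writeback of dirty pages is a Linux 
implementation detail, not anything POSIX promises):

/* queued.c -- push scattered writes into the I/O queue early, so
 * the eventual flush is a wait rather than a serialization of
 * commands sent one at a time from userspace. */
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        int fd = open("data.db", O_RDWR | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        char block[4096];
        memset(block, 'x', sizeof block);

        for (int i = 0; i < 100; i++) {
                off_t off = (off_t)i * 1024 * 1024;  /* scattered */
                if (pwrite(fd, block, sizeof block, off) < 0) {
                        perror("pwrite");
                        return 1;
                }
                /* Nudge this range toward the block layer now, so
                 * the disk can schedule all 100 writes together. */
                int err = posix_fadvise(fd, off, sizeof block,
                                        POSIX_FADV_DONTNEED);
                if (err)
                        fprintf(stderr, "posix_fadvise: %s\n",
                                strerror(err));
        }

        /* By the time we ask for integrity, the writes are already
         * queued; this mostly just waits for them to complete. */
        if (fsync(fd) < 0) { perror("fsync"); return 1; }

        close(fd);
        return 0;
}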

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Storage Systems

