Jamie Lokier <jamie@xxxxxxxxxxxxx> wrote on 01/21/2009 01:08:55 PM:

> For better or worse, I/O barriers and I/O flushes are the same thing
> in the Linux block layer. I've argued for treating them distinctly,
> because there are different I/O scheduling opportunities around each
> of them, but there wasn't much interest.

It's hard to see how they could be combined -- flushing (waiting for
the queue of writes to drain) is what you do -- at great performance
cost -- when you don't have barriers available. The point of a barrier
is to avoid having the queue run dry.

But I don't suppose it matters for this discussion.

> > Or are we talking about the command to the device to harden all earlier
> > writes (now) against a device power loss? Does fsync() do that?
>
> Ultimately that's what we're talking about, yes. Imho fsync() should
> do that, because a userspace database/filesystem should have access to
> the same integrity guarantees as an in-kernel filesystem. Linux
> fsync() doesn't always send the command - it's a bit unpredictable
> last time I looked.

Yes, it's the old performance vs integrity issue. Drives long ago came
out with features to defeat operating system integrity efforts, in
exchange for performance, by doing write caching by default, ignoring
explicit demands to write through, etc. Obviously, some people want
that, but I _have_ seen Linux developers escalate the battle for
control of the disk drive. I can just never remember where it stands
at any moment.

But it doesn't matter in this discussion because my point is that if
you accept the performance hit for integrity (I suppose we're saying
that in current Linux, in some configurations, if a process does
frequent fsyncs of a file, every process writing to every drive that
file touches will slow to write-through speed), it will be about the
same with 100 fsync_ranges in quick succession as for 1.

> A little? It's the difference between letting the disk schedule 100
> scattered writes itself, and forcing the disk to write them in the
> order you sent them from userspace, aside from the doubling the rate
> of device commands...

Again, in the scenario I'm talking about, all the writes were in the
Linux I/O queue before the first fsync_range() (thanks to fadvises),
so this doesn't happen.

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Storage Systems
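
For illustration only, the scenario described above (every write
already queued before the first wait) might be sketched as follows.
This is a rough sketch under stated assumptions, not code from the
thread. The message does not say which fadvise call is meant;
POSIX_FADV_DONTNEED is assumed here because current Linux kernels
typically start asynchronous writeback of the dirty pages in the range
when they see it (an implementation behaviour, not a documented
guarantee, and it also drops the cached pages). fsync_range() is not
available on Linux, so a single os.fsync() stands in for the final
wait. The function name write_then_sync and the (offset, data) list
are hypothetical.

```python
import os
import tempfile


def write_then_sync(path, chunks):
    """Write scattered chunks, hint writeback for all of them, then wait once.

    chunks is a list of (offset, data) pairs. The name and signature are
    hypothetical, chosen only for this illustration.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Phase 1: issue every write. The data lands in the page cache;
        # nothing here waits for the device.
        for offset, data in chunks:
            os.pwrite(fd, data, offset)

        # Phase 2: hint the kernel about every range before waiting on any.
        # ASSUMPTION: POSIX_FADV_DONTNEED nudges Linux to begin asynchronous
        # writeback of the range. It is only a hint, so failures are ignored.
        for offset, data in chunks:
            try:
                os.posix_fadvise(fd, offset, len(data),
                                 os.POSIX_FADV_DONTNEED)
            except OSError:
                pass

        # Phase 3: one wait for everything. os.fsync() is a stand-in for the
        # fsync_range() calls discussed in the thread.
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
    write_then_sync(demo_path, [(0, b"head"), (1 << 20, b"tail")])
    print(os.path.getsize(demo_path))
```

The ordering is the point of the sketch: because the hints for all
ranges are issued before the first wait, the block layer and the drive
are free to schedule the scattered writes together instead of one
synchronous write at a time. Whether the final fsync() also makes the
drive flush its own write cache depends on the kernel, filesystem and
mount options, which is the question the thread is debating.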