Scott Carey wrote:
As long as fsync() works _properly_, which is true for any file system + disk combination worth a damn (not HFS+ on OSX, not FAT, not a few other things), then it will tell the drive to flush its cache _before_ fsync() returns. There is NO REASON for a RAID card to turn off a drive cache unless it does not trust the drive cache. In write-through mode, it should not return to the OS from an fsync, direct write, or other "the OS thinks this data is persisted now" call until it has flushed the disk cache. That does not mean it has to turn off the disk cache.
Assuming that the operating system will pass fsync calls through to flush data all the way to the drive level in all situations is an extremely dangerous assumption. Most RAID controllers don't know how to force things out of the individual drive caches; that's why they turn off write caching on them. Few filesystems get the details of individual drive cache flushing right. On Linux, XFS and ext4 are the only two where there is any expectation that it will happen, and of those two ext4 is still pretty new and therefore should still be presumed to be buggy.
Please don't advise people about what is safe based on theoretical grounds here. In practice there are way too many bugs in the implementation of things like drive barriers to trust them most of the time. There is no substitute for a pull-the-plug test using something that looks for bad cache flushes, such as diskchecker.pl: http://brad.livejournal.com/2116715.html If you do that, you'll discover you must turn off the individual drive caches when using a battery-backed RAID controller, and you can't ever trust barriers on ext3 because of bugs that were only fixed in ext4.
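To make the idea of a pull-the-plug test concrete, here is a rough sketch of the principle in Python. It is not diskchecker.pl and is no substitute for it; the names write_records, verify, and ack are made up for illustration. The writer appends a checksummed record, calls fsync(), and only then reports the sequence number as acknowledged. In a real test the acknowledgment has to go to a different machine so that it survives the power cut; here it is just a callback.

```python
import os
import struct
import zlib

# Hypothetical record layout: 8-byte sequence number + CRC32 of those 8 bytes.
RECORD = struct.Struct("<QI")


def write_records(path, count, ack):
    """Append `count` records, calling fsync() after each one.

    `ack(seq)` is called only after fsync() returns. In a real test it would
    send `seq` to another machine so the log survives the power cut.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for seq in range(count):
            seq_bytes = struct.pack("<Q", seq)
            os.write(fd, RECORD.pack(seq, zlib.crc32(seq_bytes)))
            os.fsync(fd)  # the OS claims the record is durable once this returns
            ack(seq)
    finally:
        os.close(fd)


def verify(path, last_acked):
    """Return acknowledged sequence numbers that are not intact on disk."""
    found = set()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break  # end of file, or a torn final record
            seq, crc = RECORD.unpack(chunk)
            if zlib.crc32(struct.pack("<Q", seq)) == crc:
                found.add(seq)
    return [seq for seq in range(last_acked + 1) if seq not in found]
```

Run the writer on the machine under test, pull the plug partway through, and after reboot call verify() with the last sequence number the other machine saw. If it returns anything other than an empty list, some layer reported a flush that never reached stable storage.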
-- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@xxxxxxxxxxxxxxx www.2ndQuadrant.us