On Mon, Sep 12, 2011 at 6:57 PM, <david@xxxxxxx> wrote:

>> The "barrier" is the linux fs/block way of saying "these writes need
>> to be on persistent media before I can depend on them". On typical
>> spinning media disks, that means out of the disk cache (which is not
>> persistent) and on platters. The way it assures that the writes are
>> on "persistent media" is with a "flush cache" type of command. The
>> "flush cache" is a close approximation to "make sure it's persistent".
>>
>> If your cache is battery backed, it is now persistent, and there is no
>> need to "flush cache", hence the nobarrier option if you believe your
>> cache is persistent.
>>
>> Now, make sure that even though your raid cache is persistent, your
>> disks have cache in write-through mode, cause it would suck for your
>> raid cache to "work", but believe the data is safely on disk and only
>> find out that it was in the disks' (small) cache, and your raid is
>> out of sync after an outage because of that... I believe most raid
>> cards will handle that correctly for you automatically.
>
> if you don't have barriers enabled, the data may not get written out of main
> memory to the battery backed memory on the card as the OS has no reason to
> do the write out of the OS buffers now rather than later.

It's not quite so simple. The "sync" calls (pick your flavour) are what tell
the OS buffers they have to go out. The syscall (on a working FS) won't
return until the write and its data have reached the "device" safely and are
considered persistent.

But in linux, a barrier is actually a "synchronization" point, not just a
"flush cache"... It's a "guarantee everything up to now is persistent, I'm
going to start counting on it". But depending on your card, drivers and,
yes, kernel version, that "barrier" is sometimes a "drain/block I/O queue,
issue cache flush, wait, write specific data, flush, wait, open I/O queue".
The double flush is because it needs to guarantee everything previous is
good before it writes the "critical" piece, and then it needs to guarantee
that too. Now, on good raid hardware it's not usually that bad.

And then, just to confuse people more, LVM up until 2.6.29 (so that includes
all those RHEL5/CentOS5 installs out there which default to using LVM)
didn't handle barriers; it just sort of threw them out as it came across
them, meaning that you got the performance of nobarrier even if you thought
you were using barriers on poor raid hardware.

> Every raid card I have seen has ignored the 'flush cache' type of command if
> it has a battery and that battery is good, so you leave the barriers enabled
> and the card still gives you great performance.

The XFS FAQ goes over much of it, starting at Q24:
   http://xfs.org/index.php/XFS_FAQ#Q:_What_is_the_problem_with_the_write_cache_on_journaled_filesystems.3F

So, for pure performance, on a battery-backed controller, nobarrier is the
recommended *performance* setting.

But, to throw a wrench into the plan: what happens when, during normal
battery tests, your raid controller decides the battery is failing? Of
course, it's going to start screaming and set off all your monitoring alarms
(you're monitoring that, right?), but have you thought to make sure that
your FS is remounted with barriers at the first sign of battery trouble?

a.

--
Aidan Van Dyk                                             Create like a god,
aidan@xxxxxxxxxxx                                       command like a king,
http://www.highrise.ca/                                   work like a slave.
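
For readers following along: the "sync" calls referred to above are fsync(2)
and friends. Here is a minimal C sketch (my own illustration, not from the
post; the file path is hypothetical) of where the OS-buffer/persistence
boundary sits in that discussion:

    /* write() only dirties the OS page cache; fsync() is what pushes the
     * data to the block device and, with barriers enabled, causes the
     * filesystem to issue the cache-flush so its return really means
     * "persistent". With nobarrier you are trusting the controller's
     * battery-backed cache instead. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        const char *path = "/var/tmp/durability-demo";  /* hypothetical path */
        const char buf[] = "commit record\n";

        int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        /* Data lands in main memory only; a crash here can still lose it. */
        if (write(fd, buf, sizeof(buf) - 1) != (ssize_t)(sizeof(buf) - 1)) {
            perror("write");
            return 1;
        }

        /* The durability point: does not return until the data has reached
         * the "device", whatever the storage stack considers that to be. */
        if (fsync(fd) != 0) { perror("fsync"); return 1; }

        close(fd);
        printf("data considered persistent by the storage stack\n");
        return 0;
    }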
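
And a hedged sketch of the "remount with barriers at the first sign of
battery trouble" idea, assuming an ext3/ext4 filesystem (where the mount
option is barrier=1), a made-up mount point, and root privileges; the same
thing is normally done with a remount from the shell, this just shows the
underlying mount(2) call:

    #include <stdio.h>
    #include <sys/mount.h>

    int main(void)
    {
        const char *target = "/var/lib/pgsql";  /* hypothetical mount point */

        /* Equivalent of remounting the filesystem with barrier=1; source and
         * fstype are ignored for a remount. */
        if (mount(NULL, target, NULL, MS_REMOUNT, "barrier=1") != 0) {
            perror("mount");
            return 1;
        }
        printf("barriers re-enabled on %s\n", target);
        return 0;
    }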