Kevin Grittner wrote:
> I don't know at the protocol level; I just know that write barriers
> do *something* which causes our controllers to wait for actual disk
> platter persistence, while fsync does not.

It's in the docs now:
http://www.postgresql.org/docs/9.0/static/wal-reliability.html
FLUSH CACHE EXT is the ATAPI-6 call that filesystems use to enforce
barriers on that type of drive. Here's what the relevant portion of the
ATAPI spec says:
"This command is used by the host to request the device to flush the
write cache. If there is data in the write cache, that data shall be
written to the media. The BSY bit shall remain set to one until all
data has been successfully written or an error occurs."
SAS systems have a similar call named SYNCHRONIZE CACHE.
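
From the application side, the closest thing to asking for that flush is
an fsync after the write. Here is a minimal sketch in Python, assuming a
POSIX system. The function name append_and_fsync and the demo file name
are made up for illustration, and whether the fsync actually turns into
FLUSH CACHE EXT or SYNCHRONIZE CACHE depends on the kernel, filesystem,
barrier settings, and controller:

```python
import os
import tempfile


def append_and_fsync(path, payload):
    """Append payload to path, then ask the kernel to flush it.

    Hypothetical helper for illustration only. fsync requests stable
    storage; it cannot prove the drive honored the request.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        written = 0
        # os.write may write fewer bytes than asked, so loop until done.
        while written < len(payload):
            written += os.write(fd, payload[written:])
        # Block until the kernel reports the data as flushed.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "wal_segment.demo")
        count = append_and_fsync(demo_path, b"commit record\n")
        print(count, os.path.getsize(demo_path))
```

Every call pays for a full flush, which is the cost that shows up when
barriers are on.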
The improvement I actually expect to arrive here first is a reliable
implementation of O_SYNC/O_DSYNC writes. Both SAS and SATA drives that
are capable of Native Command Queueing support a write type called
"Force Unit Access" (FUA), which is essentially a direct write that
cannot be cached. Once more kernels ship reliable sync writes that map
under the hood to FUA, and we can change wal_sync_method to use them,
the need to call fsync for every write to the WAL will go away. Then
the "blow out the RAID cache when barriers are on" behavior will only
show up during checkpoint fsyncs, which will make things a lot better
(albeit still not ideal).
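
To make the sync-write alternative concrete, here is a rough Python
sketch of opening a file with O_DSYNC so that each write is synchronous
and no separate fsync call is made. This is roughly what the
open_datasync setting of wal_sync_method asks for. The helper names are
hypothetical, the fallback to O_SYNC (or to no flag at all) is my own
assumption for platforms that lack O_DSYNC, and nothing here can show
whether the kernel maps the write to FUA:

```python
import os
import tempfile

# O_DSYNC is not defined everywhere; fall back to O_SYNC, then to 0.
# With 0 the writes are NOT synchronous; that case only keeps the
# sketch runnable on platforms without either flag.
SYNC_FLAG = getattr(os, "O_DSYNC", getattr(os, "O_SYNC", 0))


def open_for_sync_writes(path):
    """Open path so each write waits until the data is reported stable."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_APPEND | SYNC_FLAG
    return os.open(path, flags, 0o600)


def write_record(fd, payload):
    """Write the whole payload; no separate fsync call is issued."""
    written = 0
    while written < len(payload):
        written += os.write(fd, payload[written:])
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "wal_segment.demo")
        fd = open_for_sync_writes(demo_path)
        try:
            write_record(fd, b"commit record 1\n")
            write_record(fd, b"commit record 2\n")
        finally:
            os.close(fd)
        print(os.path.getsize(demo_path))
```

The difference from the write-then-fsync pattern is that the durability
request travels with each write instead of as a separate flush of
everything outstanding.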
--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg@xxxxxxxxxxxxxxx www.2ndQuadrant.us