Greg Smith wrote:
> Kevin Grittner wrote:
> > I've seen this, too (with xfs).  Our RAID controller, in spite of
> > having BBU cache configured for writeback, waits for actual
> > persistence on disk for write barriers (unlike for fsync).  This
> > does strike me as surprising to the point of bordering on qualifying
> > as a bug.
>
> Completely intentional, and documented at
> http://xfs.org/index.php/XFS_FAQ#Q._Should_barriers_be_enabled_with_storage_which_has_a_persistent_write_cache.3F
>
> The issue is that XFS will actually send the full "flush your cache"
> call to the controller, rather than just the usual fsync call, and that
> eliminates the benefit of having a write cache there in the first
> place.  Good controllers respect that and flush their whole write cache
> out.  And ext4 has adopted the same mechanism.  This is very much a good
> thing from the perspective of database reliability for people with
> regular hard drives who don't have a useful write cache on their cheap
> hard drives.  It allows them to keep the disk's write cache on for other
> things, while still getting the proper cache flushes when the database
> commits demand them.  It does mean that everyone with a non-volatile
> battery backed cache, via RAID card typically, needs to turn barriers
> off manually.
>
> I've already warned on this list that PostgreSQL commit performance on
> ext4 is going to appear really terrible to many people.  If you
> benchmark and don't recognize ext3 wasn't operating in a reliable mode
> before, the performance drop now that ext4 is doing the right thing with
> barriers looks impossibly bad.

Well, this is depressing.  Now that we finally have common
battery-backed cache RAID controller cards, the file system developers
have thrown down another roadblock in ext4 and xfs.  Do we need to
document this?

On another topic, I am a little unclear on how things behave when the
drive itself is in write-back mode.
If the RAID controller card writes to the drive, but the data isn't
on the platters, how does it know when it can discard that information
from the BBU RAID cache?

--
  Bruce Momjian  <bruce@xxxxxxxxxx>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + None of us is going to be here forever. +

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance