Re: Bad iostat numbers

Steve Atkins <steve@xxxxxxxxxxx> · Wed, 6 Dec 2006 08:19:18 -0800

On Dec 5, 2006, at 8:54 PM, Greg Smith wrote:

On Tue, 5 Dec 2006, Craig A. James wrote:

I'm not familiar with the inner details of software RAID, but the  
only circumstance I can see where things would get corrupted is if  
the RAID driver writes a LOT of blocks to one disk of the array  
before synchronizing the others...

You're talking about whether the discs in the RAID are kept  
consistent. While it's helpful with that, too, that's not the main  
reason a battery-backed cache is so helpful.  When PostgreSQL  
writes to the WAL, it waits until that data has really been placed  
on the drive before it enters that update into the database.  In a  
normal situation, that means that you have to pause until the disk  
has physically written the blocks out, and that puts a fairly low  
upper limit on write performance that's based on how fast your  
drives rotate.  RAID 0, RAID 1, none of that will speed up the time  
it takes to complete a single synchronized WAL write.
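
To put a rough number on that rotational ceiling, here is a back-of-the-envelope sketch. It assumes a single session committing one transaction at a time, that each synchronized WAL write has to wait about one full platter rotation, and that seek time and group commit are ignored. The function name is made up for illustration.

```python
def max_sync_commits_per_sec(rpm: float) -> float:
    """Rough ceiling: one synchronized WAL write per platter rotation.

    Assumption (not measured): no write cache, one committing session,
    seek time ignored.
    """
    return rpm / 60.0  # rotations per second


for rpm in (7200, 10000, 15000):
    ceiling = max_sync_commits_per_sec(rpm)
    print(f"{rpm} RPM: at most about {ceiling:.0f} synced WAL writes/sec")
```

Under those assumptions a 7200 RPM drive tops out near 120 commits per second, no matter how the array is striped or mirrored.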

When your controller has a battery-backed cache, it can immediately  
tell Postgres that the WAL write completed successfully, while  
actually putting it on the disk later.  On my systems, this results  
in simple writes going 2-4X as fast as they do without a cache.   
Should there be a PC failure, as long as power is restored before  
the battery runs out, that transaction will be preserved.
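
If you want to see what your own storage does, a small timing loop along these lines may help. It is only a sketch: it uses write plus fsync on a scratch file as a stand-in for a synchronized WAL write, which is not exactly what PostgreSQL does (that depends on wal_sync_method), and the function name, block size, and write count are arbitrary choices for illustration.

```python
import os
import tempfile
import time


def synced_writes_per_sec(directory: str, n_writes: int = 200,
                          block: int = 8192) -> float:
    """Time n_writes write+fsync cycles on a scratch file in `directory`."""
    payload = b"\0" * block
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(n_writes):
            os.write(fd, payload)
            os.fsync(fd)  # block until the OS reports the data as durable
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return n_writes / elapsed


if __name__ == "__main__":
    # Point this at a directory on the volume you want to test.
    rate = synced_writes_per_sec(tempfile.gettempdir())
    print(f"about {rate:.0f} synced writes/sec")
```

A result far above the rotational ceiling of the drive suggests that a cache somewhere (controller or drive) is acknowledging writes early; whether that cache is battery-backed is something the test cannot tell you.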

What Alex is rightly pointing out is that a software RAID approach  
doesn't have this feature.  In fact, in this area performance can  
be even worse under SW RAID than what you get from a single disk,  
because you may have to wait for multiple discs to spin to the  
correct position and write data out before you can consider the  
transaction complete.
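
One way to see why waiting on several discs can be slower than waiting on one: if the spindles are not synchronized and each platter's position is independent and uniformly random (an assumption for illustration, not something measured here), the expected wait for the slowest of n discs is n/(n+1) of a rotation, versus half a rotation for a single disc. A sketch with a hypothetical helper name:

```python
def expected_rotational_wait_ms(rpm: float, n_disks: int) -> float:
    """Expected wait until the slowest of n_disks reaches its target sector.

    Assumes independent, uniformly distributed rotational positions,
    so the expected maximum of n waits is n/(n+1) of one rotation.
    """
    rotation_ms = 60_000.0 / rpm
    return rotation_ms * n_disks / (n_disks + 1)


for n in (1, 2, 4):
    wait = expected_rotational_wait_ms(7200, n)
    print(f"{n} disc(s) at 7200 RPM: about {wait:.2f} ms expected wait")
```

Under this model a two-disc mirror at 7200 RPM waits about 5.6 ms on average, against about 4.2 ms for one disc.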

So... the ideal might be a RAID1 controller with BBU for the WAL and  
something else, such as software RAID, for the main data array?

Cheers,
  Steve