Re: What's the best hardware for PostgreSQL 8.1?

At 08:35 AM 12/27/2005, Michael Stone wrote:
On Mon, Dec 26, 2005 at 10:11:00AM -0800, David Lang wrote:
what slows down RAID 5 is that to modify a block you have to read blocks from all your drives to re-calculate the parity. This interleaving of reads and writes, when all you are logically doing is writes, can really hurt. (This is why I asked the question that got us off on this tangent: when doing new writes to an array you don't have to read the blocks, as they are blank -- assuming your caching is enough that you can write blocksize*n before the system starts actually writing the data.)

Correct; there's no reason for the controller to read anything back if your write will fill a complete stripe. That's why I said that there isn't a "RAID 5 penalty" assuming you've got a reasonably fast controller and you're doing large sequential writes (or have enough cache that random writes can be batched as large sequential writes).
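To make the point above concrete, here is a small sketch (mine, not from the thread) of the physical-IO cost of a RAID 5 write. The drive count and chunk size are illustrative assumptions, and the "4 IOs" figure is the classic small-write case (read old data, read old parity, write new data, write new parity):

```python
# Sketch: why a full-stripe write avoids the RAID 5 read-modify-write
# penalty. n_drives and chunk_bytes are illustrative assumptions.

def raid5_ios_for_write(write_bytes, n_drives, chunk_bytes):
    """Rough physical-IO count for one logical write on an
    (n_drives - 1) data + 1 parity RAID 5 stripe."""
    stripe_bytes = (n_drives - 1) * chunk_bytes  # data capacity of one stripe
    if write_bytes % stripe_bytes == 0:
        # Full-stripe write: parity is computed from the new data alone,
        # so every chunk (data and parity) is written exactly once.
        stripes = write_bytes // stripe_bytes
        return stripes * n_drives  # all writes, no reads
    # Small write: read old data + old parity, write new data + new parity.
    return 4  # the classic RAID 5 small-write penalty

# 8-drive array, 64 KiB chunks -> 448 KiB of data per stripe
print(raid5_ios_for_write(448 * 1024, 8, 64 * 1024))  # 8 IOs, all writes
print(raid5_ios_for_write(8 * 1024, 8, 64 * 1024))    # 4 IOs, 2 of them reads
```

This is exactly the batching argument: a controller with enough cache can coalesce random writes into full stripes and never pay the read-back.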

Sorry. A decade+ of RWE in production with RAID 5, using controllers as bad as Adaptec and as good as Mylex, Chaparral, LSI Logic (including their Engenio stuff), and Xyratex, under 5 different OS's (Sun, Linux, M$, DEC, HP) on each of Oracle, SQL Server, DB2, MySQL, and pg, shows that RAID 5 writes are slower than RAID 5 reads.

With the one notable exception of the Mylex controller, which was so good IBM bought Mylex to put them out of business.

Enough IO load, random or sequential, will cause the effect no matter how much cache you have or how fast the controller is.

The even bigger problem that everyone is ignoring here is that large RAID 5's spend increasingly larger percentages of their time with 1 failed HD in them. The math of having that many HDs operating simultaneously 24x7 makes it inevitable.

This means you are operating in degraded mode a growing percentage of the time, under exactly the circumstances you least want to be. In addition, a growing percentage of the time you are =one= HD failure away from data loss on that array.

RAID 5 is not a silver bullet.
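A back-of-envelope sketch of the degraded-mode argument above (my numbers, not from the thread): assume independent failures, a 500,000-hour drive MTBF, and a 12-hour failure-to-rebuilt exposure window, all of which are illustrative assumptions:

```python
# Sketch: probability that a RAID set has at least one failed member
# at a randomly chosen moment. MTBF and rebuild window are assumed.

def p_degraded(n_drives, mtbf_hours=500_000.0, exposure_hours=12.0):
    """Probability that at least one of n drives is down during a
    randomly chosen exposure window, assuming independent failures."""
    p_one = exposure_hours / mtbf_hours        # one drive down in the window
    return 1.0 - (1.0 - p_one) ** n_drives     # at least one of n

for n in (8, 14, 48):
    print(n, p_degraded(n))
```

The point is the scaling, not the absolute numbers: the chance of sitting in degraded mode grows roughly linearly with the drive count, which is why large RAID 5 sets spend more of their life one failure from data loss.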


 On Mon, Dec 26, 2005 at 06:04:40PM -0500, Alex Turner wrote:
Yes, but those blocks in RAID 10 are largely irrelevant, as they go to independent disks. In RAID 5 you have to write parity to an 'active' drive that is part of the stripe.

Once again, this doesn't make any sense. Can you explain which parts of
a RAID 10 array are inactive?

I agree totally that the read+parity-calc+write in the worst case is totally bad, which is why I always recommend people should _never ever_ use RAID 5. In this day and age of large-capacity chassis and large-capacity SATA drives, RAID 5 is totally inappropriate IMHO for _any_ application, least of all databases.
I vote with Michael here. This is an extreme position to take that can't be followed under many circumstances ITRW.


So I've got a 14 drive chassis full of 300G SATA disks and need at least 3.5TB of data storage. In your mind the only possible solution is to buy another 14 drive chassis? Must be nice to never have a budget.

I think you mean an infinite budget. And that's assuming it's even possible to get the HDs you need. I've had arrays that used all the space I could give them in 160-HD cabinets. Two 160-HD cabinets were neither within the budget nor going to perform well. I =had= to use RAID 5; RAID 10 was just not space-efficient enough.
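The capacity arithmetic behind the "14 drives of 300G, need 3.5TB" example is worth spelling out (a sketch of the standard usable-capacity formulas; no hot spare is assumed, though real deployments usually reserve one):

```python
# Sketch: usable capacity of the 14 x 300 GB chassis from the example
# above, under RAID 5 vs RAID 10. No hot spare assumed (illustrative).

def usable_gb(n_drives, drive_gb, level):
    if level == "raid10":
        return (n_drives // 2) * drive_gb   # half the spindles are mirrors
    if level == "raid5":
        return (n_drives - 1) * drive_gb    # one drive's worth of parity
    raise ValueError(f"unknown RAID level: {level}")

print(usable_gb(14, 300, "raid5"))   # 3900 GB - meets the 3.5 TB requirement
print(usable_gb(14, 300, "raid10"))  # 2100 GB - falls well short
```

Which is exactly the bind: in that chassis only RAID 5 reaches 3.5 TB, so the RAID 10 vs RAID 5 debate is settled by capacity before performance even enters into it.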


Must be a hard sell if you've bought decent enough hardware that your benchmarks can't demonstrate a difference between a RAID 5 and a RAID 10 configuration on that chassis except in degraded mode (and the customer doesn't want to pay double for degraded-mode performance).

I have =never= had this situation. RAID 10 latency is better than RAID 5 latency. RAID 10 write speed under heavy enough load, of any type, is faster than RAID 5 write speed under the same circumstances. RAID 10 robustness is better as well.

Problem is that sometimes budget limits or number of HDs needed limits mean you can't use RAID 10.


In reality, I have yet to benchmark a system where RAID 5 in a single array of 8 drives or fewer beat a RAID 10 with the same number of drives.

Well, those are frankly little arrays, probably on lousy controllers...
Nah. Regardless of controller, I can take any RAID 5 and any RAID 10 built on the same HW under the same OS running the same DBMS and =guarantee= there is an IO load above which the RAID 10 will do writes faster than the RAID 5. The only exception in my career thus far has been the aforementioned Mylex controller.

OTOH, sometimes you have no choice but to "take the hit" and use RAID 5.


cheers,
Ron



