Re: What exactly is postgres doing during INSERT/UPDATE ?

On Sun, Aug 30, 2009 at 1:36 PM, Mark Mielke<mark@xxxxxxxxxxxxxx> wrote:
> On 08/30/2009 11:40 AM, Merlin Moncure wrote:
>>
>> For random writes, raid 5 has to write a minimum of two drives, the
>> data being written and parity.  Raid 10 also has to write two drives
>> minimum.  A lot of people think parity is a big deal in terms of raid
>> 5 performance penalty, but I don't -- relative to what's going on
>> in the drive, xor calculation costs (one of the fastest operations in
>> computing) are basically zero, and offloaded if you have a hardware
>> raid controller.
>>
>> I bet part of the problem with raid 5 is actually contention, since
>> your write to a stripe can conflict with other writes to a different
>> stripe.  The other problem with raid 5 that I see is that you don't
>> get very much extra protection -- it's pretty scary doing a rebuild
>> even with a hot spare (and then you should probably be doing raid 6).
>> On read performance RAID 10 wins all day long because more drives can
>> be involved.
>>
>
> In real life, with real life writes (i.e. not sequential from the start of
> the disk to the end of the disk), where the stripes on the disk being
> written are not already in RAM (to allow for XOR to be cheap), RAID 5 is
> horrible. I still recall naively playing with software RAID 5 on a three
> disk system and finding write performance to be 20% - 50% less than a single
> drive on its own.
>
> People need to realize that the cost of maintaining parity is not the XOR
> itself - XOR is cheap - the cost is having knowledge of all drives in the
> stripe in order to write the parity. This implies it is already in cache
> (requires a very large cache, or a very localized load such that the load
> all fits in cache), or it requires 1 or more reads before 2 or more writes.
> Latency is a killer here - latency is already the slowest part of the disk,
> so to effectively multiply latency x 2 has a huge impact.

This is not necessarily correct.  As long as the data you are writing
fits within one raid stripe (say 64kb), you only need the old data
for that stripe (which is stored on one disk only), the old parity
(also stored on one disk only), and the data being written to
recalculate the parity.  By 'stripe' I mean the per-disk chunk, which
lives on a single disk.  So a raid 5 random write will only involve
two drives if it fits within one stripe (and three drives if it's up
to 2x stripe size, etc).
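To make the parity point concrete, here is a minimal sketch (not taken from any particular raid implementation) of why the other data drives never need to be read: the new parity is just old parity XOR old data XOR new data. The function names, the 64k chunk size, and the four-data-disk row are assumptions chosen for illustration.

```python
import os

CHUNK = 64 * 1024  # assumed chunk ("stripe") size, matching the 64k example


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(chunks):
    """Parity the expensive way: XOR of every data chunk in the row."""
    parity = bytes(len(chunks[0]))  # all-zero buffer
    for c in chunks:
        parity = xor_bytes(parity, c)
    return parity


def updated_parity(old_parity, old_data, new_data):
    """Parity after overwriting one chunk, using only that chunk's old
    contents and the old parity (no other data drive is read)."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


# Hypothetical row spread over four data disks plus one parity disk.
row = [os.urandom(CHUNK) for _ in range(4)]
parity = full_parity(row)

# Overwrite the chunk on "disk 2" and update parity incrementally.
new_chunk = os.urandom(CHUNK)
new_parity = updated_parity(parity, row[2], new_chunk)
row[2] = new_chunk

# The shortcut agrees with recomputing parity from every drive.
assert new_parity == full_parity(row)
```

The final assert is the whole argument: the incremental update touches two drives, yet gives the same parity as reading the entire row.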

IOW, if your stripe size is 64k:
64k written:
  raid 10: two writes
  raid 5: two writes, two reads (old data and old parity; each read is
    at the same physical location as one of the writes)
128k written:
  raid 10: four writes
  raid 5: three writes, up to three reads (again at the same physical
    locations as the writes)
192k written:
  raid 10: six writes
  raid 5: four writes, up to four reads (again at the same physical
    locations as the writes)
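A rough way to tabulate this kind of count for other write sizes is sketched below. The assumptions are mine, not measurements: the write is aligned and stays within one stripe row, raid 10 is a plain two-way mirror, and raid 5 does a read-modify-write. The sketch counts every chunk-sized read individually, so the old-data and old-parity reads each show up. The function name write_ios is hypothetical.

```python
import math


def write_ios(bytes_written, chunk=64 * 1024):
    """Tally chunk-sized disk operations for one aligned random write.

    Assumptions (illustrative only): the write stays within one stripe
    row, RAID 10 is a two-way mirror, and RAID 5 uses read-modify-write
    (read old data and old parity, then write new data and new parity).
    """
    n = math.ceil(bytes_written / chunk)  # data chunks touched
    return {
        "raid10": {"reads": 0, "writes": 2 * n},      # each chunk written twice
        "raid5": {"reads": n + 1, "writes": n + 1},   # n data chunks + 1 parity
    }


for kb in (64, 128, 192):
    print(f"{kb}k written:", write_ios(kb * 1024))
```

A real controller may do better than this on the read side, for example when the old blocks are already cached or when a write covers a whole row and parity can be computed from the new data alone.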

Now, even at the 'same physical location', the drive head may still
have to move if the data is not in cache.

I realize that many raid 5 implementations tend to suck.  That said,
raid 5 should offer higher theoretical write performance than raid 10,
for both sequential and random writes (many, many online descriptions
of raid get this wrong and stupidly blame the overhead of parity
calculation).  Raid 10 wins on read all day long.  Of course, on a
typical system with lots of things going on, it gets a lot more
complicated...

(just for the record, I use raid 10 on my databases always) :-)

merlin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
