Re: RAID controllers for Postgresql on large setups

PFC wrote:
PCI limits you to 133 MB/s (theoretical), actual speed being around 100-110 MB/s.

"Current" PCI 2.1+ implementations allow 533MB/s (32bit) to 1066MB/s (64bit) since 6-7 years ago or so.

For instance, here I have a box with PCI, Giga Ethernet and a software RAID5; reading from the RAID5 goes to about 110 MB/s (actual disk bandwidth is closer to 250 but it's wasted); however, when using the giga ethernet to copy a large file over a LAN, disk and ethernet have to share the PCI bus, so throughput falls to 50 MB/s. Crummy, eh?

Sounds like a slow Giga Ethernet NIC...

Let me repeat this: at the current state of SATA drives, just TWO of them are enough to saturate a PCI bus. I'm speaking of desktop SATA drives, not high-end SCSI! (which is not necessarily faster for pure throughput anyway). Adding more drives will help random reads/writes but do nothing for throughput since the tiny PCI pipe is choking.

In my experience, SATA drives are very slow for typical database work (which is heavy on random writes). They often have very slow access times and bad or missing NCQ implementations (on the controller / SAN side as well), and while I am not very familiar with the protocol differences, they seem to add a hell of a lot more latency than even old U320 SCSI drives.

Sequential transfer performance is a nice indicator, but not very useful, since most serious RAID arrays will hit bottlenecks other than the theoretical cumulative transfer rate of all the drives (from controller cache speed to the SCSI bus to fibre channel). Thus, drives with lower sequential transfer rates but lower access times scale much better.

Is a SAN or just an external enclosure with 12 disks enough to sustain 5K inserts/updates per second on rows in the 30 to 90 byte range? At 5K/second, inserting/updating 100 million records would take 5.5 hours. That is fairly reasonable if we can achieve it. Faster would be better, but it depends on what it would cost to achieve.

5K/s inserts (with no indexes) are easy with PostgreSQL and typical current hardware. We are copying about 175K rows/s with our current server (quad-core Xeon 2.93GHz, lots of RAM, a meagre SATA SAN with RAID-5 but a 2GB writeback cache); rows average around 570 bytes each. With a typical number of indexes on the table, performance is CPU-bound and much lower than 175K/s: for single-row updates we get about 9K/s per thread (= 5.6MB/s), 100% CPU-bound on the server. If we had to max this out, we'd use several clients in parallel and/or collect the rows in text files and load them in bulk with COPY (see the sketch below). The slow SAN isn't a problem for now.
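
To illustrate the COPY route (a minimal sketch only; the table name, columns and connection string are made-up assumptions, and psycopg2 is just one convenient client): at the OP's 30-90 byte rows, 5K rows/s is well under 1 MB/s of raw row data, so per-row round trips and CPU are what you are fighting, not disk bandwidth.

import io
import psycopg2

conn = psycopg2.connect("dbname=test")   # assumption: adjust the DSN for your setup

# Build a tab-separated buffer in memory; a temp file works just as well for huge loads.
buf = io.StringIO()
for i in range(100000):
    buf.write("%d\tpayload-%d\n" % (i, i))
buf.seek(0)

with conn, conn.cursor() as cur:
    # assumption: CREATE TABLE events (id int, payload text) exists, ideally without indexes yet
    cur.copy_expert("COPY events (id, payload) FROM STDIN", buf)

conn.close()

Running several such clients in parallel, each COPYing its own batch, is how you would max out the CPU-bound case described above.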

Our SATA SAN suffers greatly when reads are interspersed with writes, for that you want more spindles and faster disks.

To the OP, I have one hearty recommendation: if you are using the RAID functionality of the 2120, get rid of it. If you can wipe the disks, try Linux software RAID (yes, it can be an admin's nightmare, but it should give much better performance, even though the 2120's plain SCSI won't be hot either) and then start tuning your PostgreSQL installation (there's much to gain here). Your setup otherwise looks decent for what you are trying to do (but you need a fast CPU). Your cheapest upgrade path would be a decent RAID controller, or at least a decent non-RAID SCSI controller for software RAID (at least 2 ports for 12 disks), although the plain PCI market is dead.
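
If you do rebuild the array as Linux software RAID, a quick way to sanity-check sequential throughput before and after is a plain sequential read off the block device (rough sketch, roughly equivalent to a dd read test; the device path is an assumption, it needs root, and you should drop the page cache first with "echo 3 > /proc/sys/vm/drop_caches" or the numbers are meaningless):

import time

DEVICE = "/dev/md0"        # assumption: your md array; any block device works
BLOCK = 1024 * 1024        # 1 MiB per read
TOTAL = 1024 * BLOCK       # read 1 GiB in total (read-only, non-destructive)

with open(DEVICE, "rb", buffering=0) as dev:
    start = time.time()
    done = 0
    while done < TOTAL:
        chunk = dev.read(BLOCK)
        if not chunk:
            break
        done += len(chunk)
    elapsed = time.time() - start

print("%d MB in %.1fs -> %.0f MB/s" % (done / 1e6, elapsed, done / 1e6 / elapsed))

If that number barely beats a single drive, the bus or controller is the bottleneck, which is exactly the situation described earlier in this thread.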

-mjy

