On Thu, Jul 9, 2009 at 9:15 AM, Chris Barnes<compuguruchrisbarnes@xxxxxxxxxxx> wrote:
> Your assistance is appreciated.
>
> I have a question regarding disk storage for postgres servers.
>
> We are thinking long term about scalable storage and performance and
> would like some advice or feedback about what other people are using.
>
> We would like to get as much performance from our file systems as possible.
>
> We use IBM 3650 quad-processor machines with an onboard SAS controller
> (3 Gb/s) and 15,000 RPM drives.
>
> We use RAID 1 for the CentOS operating system and the WAL archive logs.
>
> The postgres database is on 5 drives configured as RAID 5 with a global
> hot spare.

OK, two things jump out at me: you aren't using a hardware RAID controller with battery-backed cache, and you're using RAID-5. For most non-db applications, RAID-5 with no battery-backed cache is just fine. For some DB applications, like a reporting db or batch processing, it's OK too. For DB applications that handle lots of small transactions, it's a really bad choice.

Looking through the pgsql-performance archives, you'll see RAID-10 and HW RAID with battery-backed cache mentioned over and over again, and for good reason. RAID-10 is much more resilient, and a good HW RAID controller with battery-backed cache can re-order writes into groups that are near each other on the same drive pair, raising overall throughput, and it makes burst throughput higher as well by acknowledging an fsync as soon as the write hits the cache instead of waiting on the platters.

I'm assuming you have 8 hard drives to play with. If that's the case, you can have a RAID-1 for the OS etc. and either a RAID-10 with 4 disks and two hot spares, OR a RAID-10 with 6 disks and no hot spares. As long as you pay close attention to your server and catch failed drives and replace them by hand, the latter might work, but it really sits wrong with me.

> We are curious about using SAN with fiber channel hba and if anyone else
> uses this technology.

Yep, again, check the pgsql-performance archives. Note that the level of complexity is much higher, as is the cost, and if you're talking about a dozen or two dozen drives, you're often much better off just having a good direct-attached set of disks, either with an embedded RAID controller or as a JBOD driven by a RAID controller in the server. The top-of-the-line RAID controllers that can handle 24 or so disks run $1200 to $1500. Taking the cost of the drives out of the equation, I'm pretty sure any FC/SAN setup is gonna cost a LOT more than that single RAID card. I can buy a 16-drive 32TB DAS box for about $6k to $7k, plug it into a simple but fast SCSI controller ($400 tops), and be up in a few minutes. Setting up a new SAN is never that fast, easy, or cheap. OTOH, if you've got a dozen servers that need lots and lots of storage, a SAN will start making more sense, since it makes managing lots of hard drives easier.

> We would also like to know if people have a preference for the level of
> RAID, with/without striping.

RAID-10, then RAID-10 again, then RAID-1. RAID-6 for really big reporting dbs where storage is more important than performance and the data is mostly read anyway. RAID-5 is to be avoided, period. If you have 6 disks in a RAID-6 with no spare, you're better off than with a RAID-5 of 5 disks plus a spare, since in RAID-6 the "spare" is effectively already built in.
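To put some rough numbers behind that ranking, here's a little back-of-the-envelope Python sketch. It's my own illustration, not anything from this thread, and the drive count, size, and per-drive IOPS are made-up placeholders -- plug in your own hardware. It uses the textbook small-random-write penalties: 2 I/Os per write for RAID-10 (write both mirrors), 4 for RAID-5 (read data + read parity, write data + write parity), 6 for RAID-6 (two parity blocks to maintain).

```python
# Back-of-the-envelope RAID math: usable capacity, guaranteed failures
# survived, and aggregate random-write IOPS after the per-level write
# penalty. Inputs are hypothetical -- substitute your real drives.

LEVELS = {
    # level: (usable-disks function, failures survived, I/Os per random write)
    "RAID-10": (lambda n: n // 2, 1, 2),  # 1 guaranteed; more if failures
                                          # land in different mirror pairs
    "RAID-5":  (lambda n: n - 1,  1, 4),  # read data+parity, write data+parity
    "RAID-6":  (lambda n: n - 2,  2, 6),  # two parity blocks to update
}

def summarize(n_drives=6, drive_gb=146, drive_iops=180):
    for level, (usable, survives, penalty) in LEVELS.items():
        capacity = usable(n_drives) * drive_gb
        # Aggregate random-write IOPS is roughly total spindle IOPS
        # divided by the I/Os each logical write fans out into.
        write_iops = n_drives * drive_iops // penalty
        print(f"{level:8} {capacity:4d} GB usable, survives {survives} "
              f"failure(s), ~{write_iops} random write IOPS")

if __name__ == "__main__":
    summarize()
```

With six hypothetical 146GB 15k drives that comes out to roughly 438 GB / ~540 write IOPS for RAID-10 versus 730 GB / ~270 for RAID-5 and 584 GB / ~180 for RAID-6, which is the trade in a nutshell: RAID-5/6 buy capacity by taxing exactly the small random writes a transactional db lives on.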
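And to see firsthand what battery-backed cache buys you, you can time fsync on the volume that holds pg_xlog. This is just a crude probe I'm sketching, not a standard benchmark tool, and the file path is a placeholder -- point it anywhere on the WAL volume. Every commit has to fsync WAL, so this number caps your serial commit rate from a single connection.

```python
# Crude fsync latency probe: write one WAL-page-sized chunk and force it
# to stable storage, repeatedly, then report the rate.
import os
import time

def time_fsyncs(path="/var/lib/pgsql/fsync_probe.dat", count=200):
    # path is a placeholder -- use a file on the same volume as pg_xlog
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        start = time.time()
        for _ in range(count):
            os.write(fd, b"x" * 8192)  # one 8kB page-sized write
            os.fsync(fd)               # wait for stable storage, like a commit
        elapsed = time.time() - start
        print(f"{count / elapsed:.0f} fsyncs/sec "
              f"({elapsed / count * 1000:.2f} ms each)")
    finally:
        os.close(fd)
        os.unlink(path)

if __name__ == "__main__":
    time_fsyncs()
```

On bare spinning drives each fsync costs on the order of a platter rotation, so a 15k RPM drive tops out around 250 fsyncs/sec; behind a controller with battery-backed write-back cache the same probe typically jumps into the thousands, which is exactly the burst-commit behavior described above.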