Thanks for the info. So if RAID controllers are not an option, what should one
use to build big databases? LVM with XFS? Btrfs? ZFS?

Tigran.

----- Original Message -----
> From: "Graeme B. Bell" <graeme.bell@xxxxxxxx>
> To: "Steve Crawford" <scrawford@xxxxxxxxxxxxxxxxxxxx>
> Cc: "Wes Vaske (wvaske)" <wvaske@xxxxxxxxxx>, "pgsql-performance" <pgsql-performance@xxxxxxxxxxxxxx>
> Sent: Tuesday, July 7, 2015 12:22:00 PM
> Subject: Re: New server: SSD/RAID recommendations?

> Completely agree with Steve.
>
> 1. Intel NVMe looks like the best bet if you have modern enough hardware for
> NVMe. Otherwise, e.g. the S3700 mentioned elsewhere.
>
> 2. RAID controllers.
>
> We have e.g. 10-12 of these here and e.g. 25-30 SSDs, among various machines.
> This might give people an idea about where the risk lies in the path from
> disk to CPU.
>
> We've had 2 RAID card failures in the last 12 months that nuked the array with
> days of downtime, and 2 problems with batteries suddenly becoming useless or
> suddenly reporting wildly varying temperatures/overheating. There may have been
> other RAID problems I don't know about.
>
> Our IT dept were replacing Seagate HDDs last year at a rate of 2-3 per week (I
> guess they have 100-200 disks?). We also have about 25-30 Hitachi/HGST HDDs.
>
> So by my estimates:
> 30% annual problem rate with RAID controllers
> 30-50% failure rate with Seagate HDDs (Backblaze saw similar results)
> 0% failure rate with HGST HDDs.
> 0% failure in our SSDs. (To be fair, our one Samsung SSD apparently has a bug
> in TRIM under Linux, which I'll need to investigate to see if we have been
> affected.)
>
> Also, RAID controllers aren't free - not just the money but also the management
> of them (ever tried writing a complex install script that interacts with
> MegaCLI? It can be done but it's not much fun.). Just take a look at the
> MegaCLI manual and ask yourself... is this even worth it (if you have a good
> MTBF on an enterprise SSD).
>
> RAID was meant to be about ensuring availability of data. I have trouble
> believing that these days....
>
> Graeme Bell
>
>
> On 06 Jul 2015, at 18:56, Steve Crawford <scrawford@xxxxxxxxxxxxxxxxxxxx> wrote:
>
>>
>> 2. We don't typically have redundant electronic components in our servers. Sure,
>> we have dual power supplies and dual NICs (though generally to handle external
>> failures) and ECC-RAM but no hot-backup CPU or redundant RAM banks and...no
>> backup RAID card. Intel enterprise SSDs already have power-fail protection so I
>> don't need a RAID card to give me BBU. Given the MTBF of good enterprise SSDs
>> I'm left to wonder if placing a RAID card in front merely adds a new point of
>> failure and scheduled-downtime-inducing hands-on maintenance (I'm looking at
>> you, RAID backup battery).
>
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance