----- Original Message -----
> From: "Graeme B. Bell" <graeme.bell@xxxxxxxx>
> To: "Mkrtchyan, Tigran" <tigran.mkrtchyan@xxxxxxx>
> Cc: "Graeme B. Bell" <graeme.bell@xxxxxxxx>, "Steve Crawford" <scrawford@xxxxxxxxxxxxxxxxxxxx>, "Wes Vaske (wvaske)"
> <wvaske@xxxxxxxxxx>, "pgsql-performance" <pgsql-performance@xxxxxxxxxxxxxx>
> Sent: Tuesday, July 7, 2015 12:38:10 PM
> Subject: Re: New server: SSD/RAID recommendations?

> I am unsure about the performance side but, ZFS is generally very attractive to
> me.
>
> Key advantages:
>
> 1) Checksumming and automatic fixing-of-broken-things on every file (not just
> postgres pages, but your scripts, O/S, program files).
> 2) Built-in lightweight compression (doesn't help with TOAST tables, in fact
> may slow them down, but helpful for other things). This may actually be a net
> negative for pg so maybe turn it off.
> 3) ZRAID mirroring or ZRAID5/6. If you have trouble persuading someone that it's
> safe to replace a RAID array with a single drive... you can use a couple of
> NVMe SSDs with ZFS mirror or zraid, and get the same availability you'd get
> from a RAID controller. Slightly better, arguably, since they claim to have
> fixed the raid write-hole problem.
> 4) filesystem snapshotting
>
> Despite the costs of checksumming etc., I suspect ZRAID running on a fast CPU
> with multiple NVMe drives will outperform quite a lot of the alternatives, with
> great data integrity guarantees.

We are planning to have a test setup as well. For now I have a single NVMe SSD on my test system:

# lspci | grep NVM
85:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller 171X (rev 03)

# mount | grep nvm
/dev/nvme0n1p1 on /var/lib/pgsql/9.5 type ext4 (rw,noatime,nodiratime,data=ordered)

and am quite happy with it. We have a write-heavy workload on it to see when it will break. Postgres performs very well.
About 2.5x faster than with regular disks with a single client, and scaling almost linearly with multiple clients (picture attached: the Y axis is the number of high-level op/s our application does, the X axis is the number of clients). The setup has been in use for the last 3 months. It looks promising, but for production we need a disk twice as big as the one in the test system. Until today, I was planning to use RAID10 with a HW controller...

Related to ZFS: we use ZFS on Linux, and its behaviour is not as good as with Solaris. Let me re-phrase that: performance is unpredictable. We run RAIDZ2 with 30x3TB disks.

Tigran.

>
> Haven't built one yet. Hope to, later this year. Steve, I would love to know
> more about how you're getting on with your NVMe disk in postgres!
>
> Graeme.
>
> On 07 Jul 2015, at 12:28, Mkrtchyan, Tigran <tigran.mkrtchyan@xxxxxxx> wrote:
>
>> Thanks for the Info.
>>
>> So if RAID controllers are not an option, what one should use to build
>> big databases? LVM with xfs? BtrFs? Zfs?
>>
>> Tigran.
>>
>> ----- Original Message -----
>>> From: "Graeme B. Bell" <graeme.bell@xxxxxxxx>
>>> To: "Steve Crawford" <scrawford@xxxxxxxxxxxxxxxxxxxx>
>>> Cc: "Wes Vaske (wvaske)" <wvaske@xxxxxxxxxx>, "pgsql-performance"
>>> <pgsql-performance@xxxxxxxxxxxxxx>
>>> Sent: Tuesday, July 7, 2015 12:22:00 PM
>>> Subject: Re: New server: SSD/RAID recommendations?
>>
>>> Completely agree with Steve.
>>>
>>> 1. Intel NVMe looks like the best bet if you have modern enough hardware for
>>> NVMe. Otherwise e.g. S3700 mentioned elsewhere.
>>>
>>> 2. RAID controllers.
>>>
>>> We have e.g. 10-12 of these here and e.g. 25-30 SSDs, among various machines.
>>> This might give people idea about where the risk lies in the path from disk to
>>> CPU.
>>>
>>> We've had 2 RAID card failures in the last 12 months that nuked the array with
>>> days of downtime, and 2 problems with batteries suddenly becoming useless or
>>> suddenly reporting wildly varying temperatures/overheating. There may have been
>>> other RAID problems I don't know about.
>>>
>>> Our IT dept were replacing Seagate HDDs last year at a rate of 2-3 per week (I
>>> guess they have 100-200 disks?). We also have about 25-30 Hitachi/HGST HDDs.
>>>
>>> So by my estimates:
>>> 30% annual problem rate with RAID controllers
>>> 30-50% failure rate with Seagate HDDs (backblaze saw similar results)
>>> 0% failure rate with HGST HDDs.
>>> 0% failure in our SSDs. (to be fair, our one samsung SSD apparently has a bug
>>> in TRIM under linux, which I'll need to investigate to see if we have been
>>> affected by).
>>>
>>> also, RAID controllers aren't free - not just the money but also the management
>>> of them (ever tried writing a complex install script that interacts work with
>>> MegaCLI? It can be done but it's not much fun.). Just take a look at the
>>> MegaCLI manual and ask yourself... is this even worth it (if you have a good
>>> MTBF on an enterprise SSD).
>>>
>>> RAID was meant to be about ensuring availability of data. I have trouble
>>> believing that these days....
>>>
>>> Graeme Bell
>>>
>>>
>>> On 06 Jul 2015, at 18:56, Steve Crawford <scrawford@xxxxxxxxxxxxxxxxxxxx> wrote:
>>>
>>>>
>>>> 2. We don't typically have redundant electronic components in our servers. Sure,
>>>> we have dual power supplies and dual NICs (though generally to handle external
>>>> failures) and ECC-RAM but no hot-backup CPU or redundant RAM banks and...no
>>>> backup RAID card. Intel Enterprise SSD already have power-fail protection so I
>>>> don't need a RAID card to give me BBU. Given the MTBF of good enterprise SSD
>>>> I'm left to wonder if placing a RAID card in front merely adds a new point of
>>>> failure and scheduled-downtime-inducing hands-on maintenance (I'm looking at
>>>> you, RAID backup battery).
>>>
>>>
>>> --
>>> Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-performance
>
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
Attachment: pg-with-ssd.png
Description: PNG image