On Wed, Jul 1, 2015 at 5:06 PM, Craig James <cjames@xxxxxxxxxxxxxx> wrote:
> We're buying a new server in the near future to replace an aging system. I'd
> appreciate advice on the best SSD devices and RAID controller cards
> available today.
>
> The database is about 750 GB. This is a "warehouse" server. We load supplier
> catalogs throughout a typical work week, then on the weekend (after Q/A),
> integrate the new supplier catalogs into our customer-visible "store", which
> is then copied to a production server where customers see it. So the load is
> mostly data loading, and essentially no OLTP. Typically there are fewer than
> a dozen connections to Postgres.
>
> Linux 2.6.32

Upgrade to an OS with a later kernel, 3.11 at the lowest. 2.6.32 is
broken from an IO perspective. It writes 2 to 4x more data than needed
for normal operation.

> Postgres 9.3
> Hardware:
>   2 x INTEL WESTMERE 4C XEON 2.40GHZ
>   12GB DDR3 ECC 1333MHz
>   3WARE 9650SE-12ML with BBU
>   12 x 1TB Hitachi 7200RPM SATA disks
>     RAID 1 (2 disks)
>       Linux partition
>       Swap partition
>       pg_xlog partition
>     RAID 10 (8 disks)
>       Postgres database partition
>
> We get 5000-7000 TPS from pgbench on this system.
>
> The new system will have at least as many CPUs, and probably a lot more
> memory (196 GB). The database hasn't reached 1TB yet, but we'd like room to
> grow, so we'd like a 2TB file system for Postgres. We'll start with the
> latest versions of Linux and Postgres.

Once your db is bigger than memory, the size of the memory isn't as
important as the speed of the IO. Being able to read and write huge
swathes of data becomes more important than memory size at that point.
Being able to read 100MB/s versus being able to read 1,000MB/s is the
difference between 10 minute queries and 10 hour queries on a
reporting box.

For sequential throughput, i.e. loading and retrieving with only one
or two clients connected, you can throw more and more spinners at it.
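To put rough numbers on the throughput point, here is a minimal sketch of the arithmetic for one full sequential pass over the 750 GB database mentioned above. It assumes decimal units (1 GB = 1000 MB) and a perfectly sustained read rate, which real arrays won't quite deliver; `scan_minutes` is a hypothetical helper made up for illustration, not something from Postgres or pgbench.

```python
def scan_minutes(db_size_gb: float, throughput_mb_s: float) -> float:
    """Minutes to read db_size_gb once at a sustained throughput_mb_s.

    Assumption: a single pure sequential pass, decimal units
    (1 GB = 1000 MB), no caching and no competing IO.
    """
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    size_mb = db_size_gb * 1000
    return size_mb / throughput_mb_s / 60


if __name__ == "__main__":
    # 750 GB is the database size quoted in the thread.
    for rate in (100, 1000):
        print(f"{rate:>5} MB/s: {scan_minutes(750, rate):6.1f} min")
```

Under those assumptions a single pass takes about 125 minutes at 100MB/s and about 12.5 minutes at 1,000MB/s. A reporting query that has to make several passes, or that spills sorts and hashes to disk, multiplies that gap.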
If you're gonna have enough clients connected to make the array go
from sequential to random access, then you want to try and put SSDs in
there if possible, but the cost per gigabyte is much higher than for
spinners. ZFS can use SSDs as cache, as can some newer RAID
controllers, which represents a compromise between the two. If you go
with spinners, with or without SSD cache, throw as many at the problem
as you can. And run them in RAID-10 if you possibly can. RAID-5 and
RAID-6 are much slower, especially on spinners.

> What about a RAID controller? Are RAID controllers even available for
> PCI-Express SSD drives, or do we have to stick with SATA if we need a
> battery-backed RAID controller? Or is software RAID sufficient for SSD
> drives?

Not that I know of. PCI-E drives act as their own drive. You could
software RAID them I guess. Or do you mean are there PCI-E controllers
for SATA SSD drives? Plenty of those. Many modern controllers don't
use battery-backed cache; they've gone to flash memory, which requires
no battery to survive power-down. I like LSI, 3Ware and Areca RAID HBAs.

> Are spinning disks still a good choice for the pg_xlog partition and OS? Is
> there any reason to get spinning disks at all, or is it better/simpler to
> just put everything on SSD drives?

Spinning drives are fine for xlog and OS. If you're logging to the
same drive set as pg_xlog is using, you will hit the wall faster.
SSDs are great, until you need more space. I'd rather have an 8TB xlog
partition of spinners when setting up replication and xlog archiving
than a 500GB xlog partition on SSD. 8TB sounds like a lot until you
need to hold on to a week's worth of xlog files on a busy server.

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance