Seeking XFS tuning advice for PostgreSQL on SATA SSDs/Linux-md

Johannes Truschnigg <johannes.truschnigg@xxxxxxxxxxx> · Tue, 15 Apr 2014 14:23:07 +0200

Hi list,

we're building a postgres streaming replication slave that's supposed to 
pick up work if our primary pg cluster (with an all-flash FC SAN 
appliance as its backend store) goes down. We'll be using consumer-grade 
hardware for this setup, which I detail below:

o 2x Intel Xeon E5-2630L (24 threads total)
o 512GB DDR3 ECC RDIMM
o Intel C606-based Dual 4-Port SATA/SAS HBA (PCIID 8086:1d68)
o 6x Samsung 830 SSD with 512GB each, 25% reserved for HPA
o Debian GNU/Linux 7.x "Wheezy" + backports kernel (3.13+)
o PostgreSQL 9.0

If there's anything else that is of critical interest that I forgot to 
mention, hardware- or software-wise, please let me know.

When benchmarking the individual SSDs with fio (using the libaio 
backend), the IOPS we've seen were in the 30k-35k range overall for 4K 
block sizes. The host will be on the receiving end of a pg9.0 streaming 
replication cluster setup where the master handles ~50k IOPS peak, and 
I'm wondering how best to design the local storage stack (with 
availability in mind) so that it has a chance of keeping up with our 
flash-based FC SAN.

After digging through linux-raid archives, I think the most sensible 
approach is to build two-disk RAID1 pairs that are concatenated via either 
LVM2 or md (leaning towards the latter, since I'd expect that to have a 
tad less overhead), and xfs on top of that resulting block device. That 
should yield roughly 1.2TB of usable space (we need a minimum of 900GB 
for the DB). With this setup, it should be possible to have up to 3 CPUs 
busy with handling I/O on the block side of things, which raises the 
question of what a sensible value would be for xfs' allocation group 
count (agcount).
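
As a back-of-the-envelope check of the capacity figure above, here is a minimal Python sketch. It assumes decimal gigabytes, that the 25% HPA reservation applies uniformly to every drive, and that md and filesystem metadata overhead can be ignored. The function and parameter names are made up for illustration.

```python
def usable_capacity_gb(drives: int, drive_gb: float, reserved_fraction: float) -> float:
    """Usable size of a concatenation of two-way RAID1 pairs, in GB.

    Assumes identical drives, the same reserved fraction on each
    (e.g. via an HPA), and negligible metadata overhead.
    """
    if drives % 2 != 0:
        raise ValueError("RAID1 pairs need an even number of drives")
    visible_gb = drive_gb * (1.0 - reserved_fraction)  # per-drive space left after the HPA
    pairs = drives // 2                                # each mirror pair stores one copy
    return pairs * visible_gb


capacity = usable_capacity_gb(drives=6, drive_gb=512, reserved_fraction=0.25)
print(f"{capacity:.0f} GB usable")  # 3 pairs x 384 GB = 1152 GB
```

Under these assumptions the result is 1152GB, which is consistent with the "roughly 1.2TB" estimate and leaves some headroom over the 900GB minimum.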

I've been trying to find information on that myself, but what I managed 
to dig up is, at times, so old that it seems rather outlandish today - 
some sources on the web (from 2003), for example, say that one AG per 
4GB of underlying disk space makes sense, which seems excessive for a 
1200GB volume.
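
To put a number on "excessive", the following sketch works out what the one-AG-per-4GB rule of thumb would imply for this volume. The 1200GB figure is the rough volume size mentioned above, and the helper name is hypothetical. This is only the arithmetic of that old rule, not a recommendation.

```python
import math


def ag_count_for_rule(volume_gb: float, gb_per_ag: float) -> int:
    """Number of allocation groups if each AG covers at most gb_per_ag."""
    return math.ceil(volume_gb / gb_per_ag)


print(ag_count_for_rule(1200, 4))  # 300 AGs under the 4GB-per-AG rule
```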

I've experimented with mkfs.xfs (on top of LVM only; I don't know if it 
takes lower block layers into account) and have seen that it supposedly 
defaults to an agcount of 4, which seems insufficient given the max. 
bandwidth our setup should be able to provide.
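
To illustrate how a given agcount would line up with the three mirror pairs, here is a rough sketch. It assumes a strictly linear concatenation of three 384GB pairs (1152GB in total, whole decimal gigabytes) and equal-sized allocation groups laid out back to back. It says nothing about what mkfs.xfs actually chooses on md or LVM, and all names are hypothetical.

```python
def ag_to_members(volume_gb: int, agcount: int, member_gb: int) -> dict:
    """Map each allocation group index to the concat member(s) it touches.

    Works in whole GB and requires the volume to divide evenly into AGs,
    which keeps the arithmetic exact for this illustration.
    """
    if volume_gb % agcount != 0:
        raise ValueError("volume size must be a multiple of agcount here")
    ag_size = volume_gb // agcount
    mapping = {}
    for ag in range(agcount):
        start = ag * ag_size                 # first GB covered by this AG
        end = start + ag_size                # one past the last GB covered
        first = start // member_gb           # member holding the AG's start
        last = (end - 1) // member_gb        # member holding the AG's end
        mapping[ag] = list(range(first, last + 1))
    return mapping


for agcount in (3, 4, 6):
    print(agcount, ag_to_members(1152, agcount, 384))
# With agcount=4, AG 1 and AG 2 each straddle two mirror pairs;
# with 3 or 6, every AG sits entirely on one pair.
```

Under these assumptions, an agcount that is a multiple of the number of mirror pairs keeps each allocation group on a single pair, whereas 4 does not.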

Apart from that, is there any advice you can share for tuning xfs to run 
postgres (9.0 initially, but we're planning to upgrade to 9.3 or later 
eventually) in 2014, especially performance-wise?

Thanks, regards:
- Johannes

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs