On 6/20/13 4:32 PM, Josh Berkus wrote:
> First, cc'ing Greg Smith to see if he can address this with the Fusion
> folks so that they stop giving out a bad guide.
I'm working on a complete replacement of that guide, one that actually
gives a full set of advice. Right now I'm between their product
cycles; I'm expecting new hardware again here soon.
The main thing that no one has done a good guide to is how to reduce SSD
flash cell wear in a PostgreSQL database. If there's anyone out there
who is burning through enough FusionIO cells at your site where a) you
care about wear, and b) you can spare a drive for testing at your site,
please contact me off list if you'd like to talk about that. I have a
project aiming at increased lifetimes that's specific to FusionIO
hardware, and it could use more testers. This involves a modified
PostgreSQL though. It's not for the squeamish or for a production
system. If crashing the kernel on the test server isn't something you
can live with, move along; this will not help you for a while yet.
>>> They have a PostgreSQL setup guide from Fusion recommending the
>>> following settings:
>>> effective_io_concurrency=0
>>> bgwriter_lru_maxpages=0
I finally tracked down where this all came from. As background on how
their current guide was put together, validation of its settings
included things like testing with pgbench. It's a little-known property
of the built-in pgbench test that the best TPS *throughput* numbers come
from turning the background writer off. That's why their guide suggests
doing that. The problem with that approach is that the background
writer is intended to improve *latency*, so measuring its impact on
throughput isn't the right approach.
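If anyone wants to check that distinction on their own hardware,
measure latency rather than just TPS. A minimal sketch, with arbitrary
client count and run length and a placeholder database name:

    pgbench -c 8 -j 2 -T 300 -l pgbench_test

The -l flag writes one line per transaction to a pgbench_log.* file,
including that transaction's elapsed time in microseconds. Compare the
tail of that distribution with the background writer on and off;
that's where its benefit shows up, not in the TPS average.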
Enabling or disabling the background writer doesn't have a large impact
on flash wear. That wasn't why turning it off was recommended. It came
out of the throughput improvement.
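If you want to see how much writing the background writer is even
doing on your workload, pg_stat_bgwriter breaks down which part of the
server wrote each buffer. Something like:

    SELECT buffers_checkpoint,  -- written during checkpoints
           buffers_clean,       -- written by the background writer
           buffers_backend      -- written by ordinary backends
    FROM pg_stat_bgwriter;

On a lot of workloads buffers_clean turns out to be a small slice of
the total writes, which is another way of seeing why this setting
isn't where flash wear comes from.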
Similarly, effective_io_concurrency goes untested by pgbench; its
queries aren't complicated enough to exercise it. I would consider
both of these settings
worse than the defaults, and far from optimal for their hardware.
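For reference, here are the stock defaults those two recommendations
override (9.2-era values):

    effective_io_concurrency = 1   # default; only drives prefetching for bitmap heap scans
    bgwriter_lru_maxpages = 100    # default; setting it to 0 turns the background writer off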
>>> random_page_cost=0.1
>>> sequential_page_cost=0.1
As Shaun already commented on a bit--I agree with almost everything he
suggested--the situation with random vs. sequential I/O on this hardware
is not as simple as it's sometimes made out to be. Random I/O
certainly is not the same speed as sequential even on their hardware,
and sequential I/O there is not always 10X as fast as on traditional
drives. Using values closer to 1.0, as he suggested, is a lot more
sensible to me.
I also don't think random_page_cost = seq_page_cost is the best setting,
even though it did work out for Shaun. The hardware is fast enough that
you can make a lot of mistakes without really paying for them though,
and any query tuning like that is going to be workload dependent. It's
impossible to make a single recommendation here.
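Purely as an illustration of a defensible starting range--not a
recommendation, since the right numbers depend on your workload--I'd
sooner see something like this than 0.1 across the board:

    seq_page_cost = 1.0      # leave the baseline where it is
    random_page_cost = 1.5   # illustrative; below the 4.0 default, still above seq_page_cost

That at least keeps the planner's relative costs in a sane order.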
> FusionIO drives are fast, but they're not infinitely fast. My tests
> (and others) show they're about 1/2 the speed of memory, regarding IOPS.
This part needs a disclaimer on it. Memory speed varies significantly
between servers. The range just on new servers right now goes from
5GB/s to 40GB/s. There are a good number of single and dual socket
Intel systems where "1/2 the speed of memory" is about right. There are
systems where the ratio will be closer to 1:1 or 1:4 though.
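To put rough numbers on that--these are made up but plausible--consider
a card that sustains 640,000 IOPS on 8KB pages:

    640,000 IOPS x 8KB  = ~5 GB/s  (hypothetical card)
    vs.  5 GB/s memory -> ~1:1
    vs. 10 GB/s memory -> ~1:2
    vs. 20 GB/s memory -> ~1:4
    vs. 40 GB/s memory -> ~1:8

The card's throughput is fixed; the ratio depends entirely on the
server you drop it into.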
--
Greg Smith 2ndQuadrant US greg@xxxxxxxxxxxxxxx Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com