Richard Yen wrote:
I figured that pg_xlog and data/base could both be on the FusionIO drive, since there would be no latency when there are no spindles.
(Rolls eyes) Please be careful about how much SSD Kool-Aid you drink, and be skeptical of vendor claims. SSDs don't just make latency go away, particularly on heavy write workloads where the technology is at its weakest.
Also, random note, I'm seeing way too many FusionIO drive setups where people don't have any redundancy to cope with a drive failure, because the individual drives are so expensive they don't have more than one. Make sure that if you lose one of the drives, you won't have a massive data loss. Replication might help with that, if you can stand a little bit of data loss when the SSD dies. Not if--when. Even if you have a good one they don't last forever.
This means my pg_xlog partition should be (2 + checkpoint_completion_target) * checkpoint_segments + 1 = 41 files, or 656MB. Then, if there are more than 49 files, unneeded segment files will be deleted, but in this case all segment files are needed, so they never got deleted. Perhaps we should add in the docs that pg_xlog should be the size of the DB or larger?
Excessive write volume beyond the capacity of the hardware can end up delaying the normal checkpoint that would have cleaned up all the xlog files. There's a nasty spiral the system can get into here, one I've seen a couple of times in a form similar to what you reported. The pg_xlog directory should never exceed the size computed by that formula for very long, but it can burst above its normal size limits for a little bit. This is already mentioned as a possibility in the manual: "If, due to a short-term peak of log output rate, there are more than 3 * checkpoint_segments + 1 segment files, the unneeded segment files will be deleted instead of recycled until the system gets back under this limit." Autovacuum is an easy way to get the sort of activity needed to cause this problem, but I don't know if it's a necessary component to see the problem. You have to be in an unusual situation before the sum of the xlog files is anywhere close to the size of the database, though.
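To make the arithmetic concrete, here is a rough sketch of the two limits discussed above. The settings checkpoint_segments = 16 and checkpoint_completion_target = 0.5 are not stated in this thread; they are assumed because they are the values that reproduce the 41-file and 49-file figures quoted. The 16MB segment size is the default, the function names are made up for illustration, and rounding up to a whole file is also an assumption.

```python
import math

WAL_SEGMENT_MB = 16  # default WAL segment size; assumed, matches 656MB / 41 files


def normal_max_segments(checkpoint_segments, checkpoint_completion_target):
    """Usual upper bound on pg_xlog files:
    (2 + checkpoint_completion_target) * checkpoint_segments + 1.
    Rounded up to a whole file (rounding is an assumption of this sketch)."""
    return math.ceil((2 + checkpoint_completion_target) * checkpoint_segments + 1)


def delete_threshold_segments(checkpoint_segments):
    """File count from the manual text quoted above:
    beyond 3 * checkpoint_segments + 1, unneeded segments are deleted
    instead of recycled."""
    return 3 * checkpoint_segments + 1


def segments_to_mb(segments):
    """Convert a segment count to megabytes at the assumed segment size."""
    return segments * WAL_SEGMENT_MB


# Assumed settings, inferred from the figures in the thread.
segments = 16
target = 0.5

normal = normal_max_segments(segments, target)
threshold = delete_threshold_segments(segments)

print("normal ceiling:", normal, "files,", segments_to_mb(normal), "MB")
print("delete threshold:", threshold, "files,", segments_to_mb(threshold), "MB")
```

Neither number is a hard cap. As described above, a checkpoint that gets delayed by write volume lets pg_xlog grow past both until the checkpoint finishes, so size the partition with headroom rather than to the formula exactly.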
--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@xxxxxxxxxxxxxxx  www.2ndQuadrant.us

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)