Re: SSI slows down over time

Ryan Johnson <ryan.johnson@xxxxxxxxxxxxxx> · Mon, 07 Apr 2014 10:38:52 -0400

On 05/04/2014 10:25 PM, Ryan Johnson wrote:
Hi all,

Disclaimer: this question probably belongs on the hackers list, but 
the instructions say you have to try somewhere else first... toss-up 
between this list and a bug report; list seemed more appropriate as a 
starting point. Happy to file a bug if that's more appropriate, though.

This is with pgsql-9.3.4, x86_64-linux, home-built with `./configure 
--prefix=...' and gcc-4.7.
TPC-C courtesy of oltpbenchmark.com. 12WH TPC-C, 24 clients.

I see strange behavior across repeated runs: each 100-second run is 
a bit slower than the one preceding it, when run with SSI 
(SERIALIZABLE). Switching to SI (REPEATABLE_READ) removes the problem, 
so it's apparently not due to the database growing. The database is 
completely shut down (pg_ctl stop) between runs, but the data lives in 
tmpfs, so there's no I/O problem here. 64GB RAM, so no paging, either.

The plot thickens...

I just had a run die with an out of (tmpfs) disk space error; the 
pg_serial directory occupies 16GB, or 64825 segments (just under the 65k 
limit for SLRU). A bit of source diving confirms that this is the 
backing store for the OldSerXid SLRU that SSI uses. I'm not sure what 
would prevent SLRU space from being reclaimed, though, given that a 
complete, clean database shutdown happens between every run. In 
theory, all SSI info can be forgotten any time there are no serializable 
transactions in the system.
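
For anyone wanting to watch the same directory between runs, here is a minimal sketch of how its footprint could be measured. It assumes a default build, where each SLRU segment file holds 32 pages of 8kB (256kB per segment). The function names and the path in the usage note are illustrative only, not part of PostgreSQL.

```python
import os

# Assumption: default build (BLCKSZ = 8192, 32 pages per SLRU segment).
SEGMENT_BYTES = 32 * 8192


def pg_serial_usage(serial_dir):
    """Return (segment_count, total_bytes) for the regular files in serial_dir."""
    count = 0
    total = 0
    with os.scandir(serial_dir) as entries:
        for entry in entries:
            if entry.is_file():
                count += 1
                total += entry.stat().st_size
    return count, total


def expected_bytes(segment_count):
    """Size implied by a segment count if every segment file is full."""
    return segment_count * SEGMENT_BYTES
```

Calling `pg_serial_usage("/path/to/data/pg_serial")` after each `pg_ctl stop` would show whether segments survive the shutdown. Under the same assumption, `expected_bytes(64825)` is 16,993,484,800 bytes (about 15.8 GiB), which is consistent with the 16GB reported above.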

I nuked the pgsql data directory and started over, and started firing 
off 30-second runs (with pg_ctl start/stop in between each). On about 
the sixth run, throughput dropped to ~200tps and the benchmark harness 
terminated with an assertion error. I didn't see anything interesting in 
the server logs (the database shut down normally), but the pg_serial 
directory had ballooned from ~100kB to 8GB.
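
The start/run/stop cycle described above could be scripted roughly as follows. This is only a sketch: `run_cycle` and `bench_cmd` are hypothetical names, the benchmark command line is whatever the harness needs, and the `run` parameter exists only so the function can be exercised without a live server.

```python
import subprocess


def run_cycle(data_dir, bench_cmd, run=subprocess.run):
    """Start the server, run one benchmark, then shut down cleanly."""
    # -w makes pg_ctl wait until the start or stop has completed.
    run(["pg_ctl", "-D", data_dir, "-w", "start"], check=True)
    try:
        # bench_cmd is a placeholder for the benchmark harness invocation.
        run(bench_cmd, check=True)
    finally:
        # Stop the server even if the harness fails.
        run(["pg_ctl", "-D", data_dir, "-w", "stop"], check=True)
```

Calling `run_cycle` in a loop and recording the size of pg_serial after each call would reproduce the sequence of runs.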

I tried to repro, and a series of 30-second runs gave the following 
throughputs (tps):
*4615
3155 3149 3115 3206 3162 3069 3005 2978 2953 **308
2871 2876 2838 2853 2817 2768 2736 2782 2732 2833
2749 2675 2771 2700 2675 2682 2647 2572 2626 2567
*4394

That ** entry was the 8GB blow-up again. All files in the directory had 
been created at the same time (i.e., not during a previous run), and 
persisted through the runs that followed. There was also a run where 
abort rates jumped through the roof (~40k aborts rather than the usual 
2000 or so), with a huge number of "out of shared memory" errors; 
apparently max_pred_locks_per_transaction=2000 wasn't high enough.

The two * entries were produced by runs under SI, and confirm that the 
rest of the system has not been slowing down nearly as much as SSI. SI 
throughput dropped by 5% as the database quadrupled in size. SSI 
throughput dropped by 23% during the same interval. And this was 
actually one of the better sets of runs; I had a few last week that 
dipped below 1ktps.
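
The percentages can be rechecked from the table. The sketch below assumes the endpoints are the first and last run of each kind (4615 and 4394 for SI, 3155 and 2567 for SSI), which is one reading of "the same interval". The helper name is made up for illustration.

```python
def pct_drop(first, last, relative_to="first"):
    """Percentage drop from first to last, measured against either endpoint."""
    base = first if relative_to == "first" else last
    return 100.0 * (first - last) / base


si = (4615, 4394)    # the two * runs (SI)
ssi = (3155, 2567)   # first and last SSI runs in the table

for name, (first, last) in (("SI", si), ("SSI", ssi)):
    print(name,
          round(pct_drop(first, last, "first"), 1),
          round(pct_drop(first, last, "last"), 1))
```

Measured against the final run, the drops come out at about 5.0% and 22.9%, matching the figures quoted. Measured against the first run they are about 4.8% and 18.6%. Either way, SSI degrades roughly four times as much as SI over the interval.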

I'm not sure what to make of this. Thoughts?

Ryan

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)