Re: Are we in the ballpark?

Wayne Conrad <wayne@xxxxxxxxxxxx> · Wed, 02 Feb 2011 10:06:53 -0700

Greg, It's so nice to get a reply from the author of *the book*.  Thank 
you for taking the time to help us out.

On 02/01/11 18:30, Greg Smith wrote:
Do you not want any excitement in your life?

I've had database excitement enough to last a lifetime.  That's why I'm 
mending my ways.  Your book is the first step of our 12 step program.

8.4.7 is current; there are a lot of useful fixes to be had. See if you
can get a newer Debian package installed before you go live with this.

I'll look for 8.4.7, but we'll be switching to 9 before too long.

File system: XFS (nobarrier, noatime)

Should probably add "logbufs=8" in there too.

Will do.

work_mem = 192MB
wal_buffers = 8MB
random_page_cost = 1.0

That work_mem is a bit on the scary side of things, given how much
memory is allocated to other things. Just be careful you don't get a lot
of connections and run out of server RAM.

That's a leftover from the days when we *really* didn't know what we're 
doing (now we only *mostly* don't know what we're doing).  I'll set 
work_mem down to something less scary.

Might as well bump wal_buffers up to 16MB and be done with it.

Will do.

Setting random_page_cost to 1.0 is essentially telling the server the
entire database is cached in RAM. If that's not true, you don't want to
go quite that far in reducing it.

Oops, that was a typo.  We've set random_page_cost to 2, not 1.

With 8.4, you should be able to keep constraint_exclusion at its default
of 'partition' and have that work as expected; any particular reason you
forced it to always be 'on'?

See "we really didn't know what we were doing."  We'll leave 
constraint_exclusion at its default.

Bonnie++ (-f -n 0 -c 4)
$PGDATA/xlog (RAID1)
random seek: 369/sec
block out: 87 MB/sec
block in: 180 MB/sec
$PGDATA (RAID10, 12 drives)
random seek: 452
block out: 439 MB/sec
block in: 881 MB/sec

sysbench test of fsync (commit) rate:

$PGDATA/xlog (RAID1)
cache off: 29 req/sec
cache on: 9,342 req/sec
$PGDATA (RAID10, 12 drives)
cache off: 61 req/sec
cache on: 8,191 req/sec

That random seek rate is a little low for 12 drives, but that's probably
the limitations of the 3ware controller kicking in there. Your "cache
off" figures are really weird though; I'd expect those both to be around
100. Makes me wonder if something weird is happening in the controller,
or if there was a problem with your config when testing that. Not a big
deal, really--the cached numbers are normally going to be the important
ones--but it is odd.

I also thought the "cache off" figures were odd.  I expected something 
much closer to 120 req/sec (7200 rpm drives).  I probably won't 
investigate that with any vigor, since the cache-on numbers are OK.

Your pgbench SELECT numbers look fine, but particularly given that
commit oddity here I'd recommend running some of the standard TPC-B-like
tests, too, just to be completely sure there's no problem here. You
should get results that look like "Set 3: Longer ext3 tests" in the set
I've published to http://www.2ndquadrant.us/pgbench-results/index.htm
presuming you let those run for 10 minutes or so. The server those came
off of has less RAM and disks than yours, so you'll fit larger database
scales into memory before performance falls off, but that gives you
something to compare against.

TCB-B-like tests, will do.

Greg, Thanks a million.

    Wayne Conrad

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance