On Tue, 31 Mar 2009, Kevin Grittner wrote:
> On Thu, Apr 17, 2008 at 7:26 PM, Greg Smith wrote:
>> On this benchmark 2.6.25 is the worst kernel yet:
> I don't remember seeing a follow-up on this issue from last year.
> Are there still any particular kernels to avoid based on this?
I just discovered something really fascinating here. The problem is
strictly limited to when you're connecting via Unix-domain sockets; use
TCP/IP instead, and it goes away.
To refresh everyone's memory, I reported this problem to the LKML at
http://lkml.org/lkml/2008/5/21/292 and got some patches and some kernel
tweaks for the scheduler, but never a clear resolution of the cause, which
kept anybody from getting too excited about merging anything. Test results
comparing various tweaks on the hardware I'm still using now are at
http://lkml.org/lkml/2008/5/26/288
For example, here's kernel 2.6.25 running pgbench with 50 clients with a
Q6000 processor, demonstrating poor performance--I'd get >20K TPS here
with a pre-CFS kernel:
$ pgbench -S -t 4000 -c 50 -n pgbench
transaction type: SELECT only
scaling factor: 10
query mode: simple
number of clients: 50
number of transactions per client: 4000
number of transactions actually processed: 200000/200000
tps = 8288.047442 (including connections establishing)
tps = 8319.702195 (excluding connections establishing)
If I now execute exactly the same test, but using localhost, performance
returns to normal:
$ pgbench -S -t 4000 -c 50 -n -h localhost pgbench
transaction type: SELECT only
scaling factor: 10
query mode: simple
number of clients: 50
number of transactions per client: 4000
number of transactions actually processed: 200000/200000
tps = 17575.277771 (including connections establishing)
tps = 17724.651090 (excluding connections establishing)
That's 100% repeatable; I ran each test several times each way.
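To put a number on the gap between the two runs, here is a rough sketch that pulls the "excluding connections establishing" figure out of pgbench output and computes the ratio. The function names `extract_tps` and `tps_ratio` are made up for this example, and it assumes the output format shown above; the two input lines are copied from the results in this message rather than produced by a live run.

```shell
#!/bin/sh
# Hypothetical helper: read pgbench output on stdin and print the tps figure
# from the "excluding connections establishing" line.
extract_tps() {
    awk '/^tps = .*excluding connections establishing/ { print $3 }'
}

# Hypothetical helper: print how many times larger the second figure ($2)
# is than the first ($1), to two decimal places.
tps_ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", b / a }'
}

# Result lines copied from the two runs above (Unix socket, then localhost).
socket_tps=$(echo 'tps = 8319.702195 (excluding connections establishing)' | extract_tps)
tcp_tps=$(echo 'tps = 17724.651090 (excluding connections establishing)' | extract_tps)

echo "Unix socket: $socket_tps  TCP: $tcp_tps  ratio: $(tps_ratio "$socket_tps" "$tcp_tps")"
```

Against a live server the same function could sit at the end of a pipe, for example `pgbench -S -t 4000 -c 50 -n pgbench | extract_tps`. For the numbers above the ratio works out to about 2.13, so the TCP/IP run is a bit more than twice as fast.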
So the new summary here of what I've found is that if:
1) You're running Linux 2.6.23 or greater (confirmed in up to 2.6.26)
2) You connect over a Unix-domain socket
3) Your client count is relatively high (>8 clients/core)
You can expect your pgbench results to tank. Switch to connecting over
TCP/IP to localhost, and everything is fine; it's not quite as fast as the
pre-CFS kernels in some cases, but in others it's faster.
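The three conditions above can be sketched as a quick check. This is only an illustration: the function names are hypothetical, the 4-core figure in the example call is an assumption about the test machine, and the script treats every kernel from 2.6.23 onward as affected even though I've only confirmed it up to 2.6.26. The host test relies on the libpq convention that an empty host, or one starting with a slash, means a Unix-domain socket.

```shell
#!/bin/sh
# Hypothetical helper: succeed if kernel version $1 (e.g. "2.6.25" or
# "2.6.26-2-amd64") is 2.6.23 or later.
# Assumption: anything newer than 2.6.26 is treated as affected, although
# that has not been tested.
kernel_has_cfs() {
    echo "$1" | awk -F'[.-]' '{
        v = $1 * 1000000 + $2 * 1000 + $3
        exit !(v >= 2006023)
    }'
}

# Hypothetical helper: succeed if clients ($1) exceed 8 per core ($2).
high_client_ratio() {
    [ "$1" -gt $(( $2 * 8 )) ]
}

# Succeed only if all three conditions hold.
# Arguments: kernel version, host, client count, core count.
at_risk() {
    kernel=$1 host=$2 clients=$3 cores=$4
    case "$host" in
        ''|/*) ;;        # empty or a directory path: Unix-domain socket
        *) return 1 ;;   # a hostname or address: TCP/IP
    esac
    kernel_has_cfs "$kernel" && high_client_ratio "$clients" "$cores"
}

# Example: the 2.6.25 run above, 50 clients, assuming 4 cores.
if at_risk 2.6.25 "" 50 4; then
    echo "Unix-socket pgbench results are likely to be depressed here"
fi
```

On a real system the arguments might come from `uname -r`, `$PGHOST`, and `getconf _NPROCESSORS_ONLN`, with the client count matching whatever you pass to `pgbench -c`.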
I haven't gotten to testing kernels newer than 2.6.26 yet; when I saw a
17K TPS result during one of my tests on 2.6.25, I screeched to a halt to
isolate this instead.
--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD