On Mon, Jan 11, 2010 at 11:20 AM, Kevin Grittner <Kevin.Grittner@xxxxxxxxxxxx> wrote:
> Bob Dusek <redusek@xxxxxxxxx> wrote:
>> Kevin Grittner <Kevin.Grittner@xxxxxxxxxxxx> wrote:
>>> Bob Dusek <redusek@xxxxxxxxx> wrote:
>
>>> Anyway, my benchmarks tend to show that best throughput occurs at
>>> about (CPU_count * 2) plus effective_spindle_count. Since you
>>> seem to be fully cached, effective_spindle_count would be zero,
>>> so I would expect performance to start to degrade when you have
>>> more than about 32 sessions active.
>>>
>> That's a little disheartening for a single or dual CPU system.
>
> Not at all. You only have so many resources to keep busy at any one
> moment. It is generally more efficient to only context switch
> between as many processes as can keep those resources relatively
> busy; otherwise valuable resources are spent switching among the
> various processes rather than doing useful work.
>
> [Regular readers of this list might want to skip ahead while I run
> through my usual "thought experiment" on the topic. ;-) ]
>
> Imagine this hypothetical environment -- you have one CPU running
> requests. There are no other resources to worry about and no
> latency to the clients. Let's say that the requests take one second
> each. The client suddenly has 100 requests to run. Assuming
> context switching is free, you could submit them all at once, and 100
> seconds later you get 100 responses, with an average response time
> of 100 seconds. Now let's put an (again free) connection pooler in
> there. You submit those 100 requests, but they are fed to the
> database one at a time. You get one response back in one second,
> the next in two seconds, the last in 100 seconds. No request took
> any longer, and the average response time was 50.5 seconds -- almost
> a 50% reduction.
>
> Now context switching is not free, and you had tens of thousands of
> them per second.

FYI, on an 8- or 16-core machine, 10k to 30k context switches per second aren't that much.
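The arithmetic behind that thought experiment can be sketched in a few lines of Python. The function names here are made up for illustration; the numbers (one CPU, 100 one-second requests, free context switching) come straight from the scenario above:

```python
def avg_response_concurrent(n_requests, seconds_each):
    # One CPU, free fair time-slicing: all n requests finish together
    # at n * t, so every request -- and the average -- takes n * t.
    return n_requests * seconds_each

def avg_response_pooled(n_requests, seconds_each):
    # A (free) pooler feeds requests to the database one at a time:
    # request i finishes at i * t, so the average is t * (n + 1) / 2.
    return seconds_each * (n_requests + 1) / 2

print(avg_response_concurrent(100, 1.0))  # 100.0
print(avg_response_pooled(100, 1.0))      # 50.5
```

Same total work, same worst-case latency, but queueing the requests halves the average response time.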
If you're climbing past 100k, you might want to look out.

The more I read up on the 74xx-series CPUs and look at the numbers here, the more I think it's just that this machine has X amount of memory bandwidth and it's using it all up. You could put 1,000 cores in it and it wouldn't go any faster. My guess is that a 4x6-core AMD machine, or even a 2x6-core Nehalem, would be much faster at this job. The only way to tell is to run something like the STREAM benchmark and see how it scales, memory-wise, as you add cores to the benchmark.

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
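P.S. For anyone who wants to see the "bandwidth wall" effect without building STREAM, here's a rough, hypothetical sketch (not the real benchmark -- Python interpreter overhead dominates, so treat the absolute numbers as meaningless): run identical memory-walking passes on 1, 2, 4... workers at once, and watch whether the per-pass time climbs as workers are added.

```python
import time
from multiprocessing import Pool

WORDS = 2_000_000  # each worker walks a list of ~2M boxed ints

def stream_once(_):
    # One sequential pass over a large in-memory list: a crude,
    # Python-flavored stand-in for a STREAM-style bandwidth pass.
    data = list(range(WORDS))
    t0 = time.perf_counter()
    total = sum(data)
    elapsed = time.perf_counter() - t0
    # Check the sum so the pass can't be skipped or optimized away.
    assert total == WORDS * (WORDS - 1) // 2
    return elapsed

def scaling(worker_counts=(1, 2, 4)):
    # Run N passes concurrently on N workers. If the average per-pass
    # time climbs as N grows, the cores are contending for memory
    # bandwidth rather than scaling independently.
    results = {}
    for n in worker_counts:
        with Pool(n) as pool:
            times = pool.map(stream_once, range(n))
        results[n] = sum(times) / n
    return results

if __name__ == "__main__":
    for n, avg in scaling().items():
        print(f"{n} workers: {avg * 1000:.1f} ms per pass")
```

For real numbers, compile the actual STREAM benchmark with OpenMP and vary the thread count instead; this sketch only shows the shape of the measurement.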