On 28/04/10 18:25, Scott Marlowe wrote:
> On Wed, Apr 28, 2010 at 3:15 AM, Piotr Kublicki <Piotr.Kublicki@xxxxxxx> wrote:
>>
>> Dears,
>>
>> Sorry to be a royal pain, but I cannot find it anywhere in the
>> documentation: how many threads/CPU cores can Postgres v. 8.4 utilise?
>> We're thinking about installing Postgres on a virtual machine (RedHat 5,
>> 64-bit), but we're not sure how many CPUs can sensibly be assigned
>> without wasting resources. Can Postgres utilise a multi-core/
>> multi-threaded architecture to a reasonable extent?
>
> Like Craig mentioned, each connection basically uses one core, and the
> OS can use one or maybe two. That means that even on moderately
> busy servers 4 to 8 cores is very reasonable. On modern hardware it's
> easy to get 6 or 8 cores pretty cheaply, and 2P machines can have 12 or
> 16 cores for fairly cheap too.
>
> Pgsql will get faster quickly as you increase parallel load up to the
> number of cores you have (assuming enough memory bandwidth to keep up),
> then slowly trail off as you add more concurrent connections. If you're
> likely to have hundreds of concurrent connections then adding more cores
> past 8 or 16 makes a lot of sense.

... if you expect them all to actually be doing work. People with huge
connection counts often have mostly idle connections. In that case they
really need to use a connection pooler and limit the number of live
connections to Pg itself to something more reasonable. More cores never
hurt, but if your workload isn't all that high you may simply not need
them.

Even if the connections *aren't* mostly idle and are all competing to get
work done then, as Scott says, better throughput can usually be achieved
with many tens of connections behind a connection pooler than with many
hundreds of direct connections. In that case you need not only more CPU
cores but also more disk I/O (more and faster spindles) to improve your
throughput.
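For example, with PgBouncer (one common server-side pooler) a minimal config sketch along these lines accepts hundreds of clients while opening only a couple of dozen live connections to Pg. The database name, paths and numbers here are illustrative, not recommendations; tune the pool size to your core count and workload:

```ini
; illustrative pgbouncer.ini sketch -- names, paths and limits are made up
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction       ; server connections are shared per-transaction
max_client_conn = 500         ; clients the pooler will accept
default_pool_size = 20        ; live connections actually opened to Pg
```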
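The mechanism a pooler relies on can be sketched in a few lines. This toy (not PgBouncer, and no real database involved) just shows how many clients can share a small, fixed set of "server connections", with extra callers blocking until one is free:

```python
# Toy sketch of connection pooling: 100 client threads, but at most
# 4 "server connections" are ever in use at once.
import queue
import threading

class ToyPool:
    """Hands out at most `size` connections; extra callers wait."""
    def __init__(self, size, connect):
        self._q = queue.Queue()
        for _ in range(size):
            self._q.put(connect())

    def acquire(self):
        return self._q.get()   # blocks while all connections are busy

    def release(self, conn):
        self._q.put(conn)

pool = ToyPool(4, connect=lambda: object())   # stand-in for a real connect()
peak = 0
in_use = 0
lock = threading.Lock()

def client():
    global peak, in_use
    conn = pool.acquire()
    with lock:
        in_use += 1
        peak = max(peak, in_use)
    # ... a real client would run its query here ...
    with lock:
        in_use -= 1
    pool.release(conn)

threads = [threading.Thread(target=client) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(peak)   # never exceeds 4
```

The point is that the cap on concurrent live connections, not the client count, is what determines the load Pg actually sees.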
It's worth realising that Pg by itself doesn't scale all that well to very
large numbers of connections, because it has fairly high per-connection
overheads (backend instance RAM use, inter-backend communication, lock
management, etc.). It works extremely well behind server-side connection
poolers, though, so in practice it can happily be used with huge client
connection counts.

(I suspect that most DBs that *do* scale really well to high connection
counts just do internal connection pooling, separating connection state
housekeeping from query execution. That would be a nice-to-have, but it
probably isn't practical without a multi-threaded single-process query
execution core, and that's not a pleasant notion given Pg's architecture.)

--
Craig Ringer

Tech-related writing: http://soapyfrogs.blogspot.com/

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general