On 16.3.2015 00:55, michael@xxxxxxxxxxx wrote:
> Why is 500 connections "insane". We got 32 CPU with 96GB and 3000
> max connections, and we are doing fine, even when hitting our max
> concurrent connection peaks around 4500. At a previous site, we were
> using 2000 max connections on 24 CPU and 64GB RAM, with about 1500
> max concurrent connections. So I wouldn't be too hasty in saying more
> than 500 is asking for trouble. Just as long as you got your kernel
> resources set high enough to sustain it (SHMMAX, SHMALL, SEMMNI, and
> ulimits), and RAM for work_mem.

If all the connections are active at the same time (i.e. running queries), they have to share the 32 cores somehow. Or the I/O, if that's the bottleneck. In other words, you're not improving the throughput of the system, you're merely increasing latencies. And the latency increase may easily be worse than linear, because of locking, context switches and other process-management overhead.

Imagine a query that takes 1 second of CPU time. If you have 64 such queries running concurrently on 32 cores, each gets only 1/2 a CPU and so takes >=2 seconds. With 500 queries it's >=15 seconds per query, etc. If those queries are acquiring the same locks (e.g. updating the same rows, or so), you can imagine what happens ...

Also, if part of the query plan requires a certain amount of memory, the query now holds that memory for much longer too. That only increases the chance of OOM issues.

It may work fine while most of the connections are idle, but it's what makes storms like this possible.

--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
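
To make the back-of-envelope math above concrete, here is a minimal Python sketch. The function name and the assumption of perfectly fair CPU scheduling are illustrative only (not anything PostgreSQL computes), and it ignores locking and context-switch overhead, which would only make the numbers worse:

    def fair_share_latency(cpu_seconds, concurrent_queries, cores):
        # Each runnable query gets at most cores/concurrent_queries of a CPU,
        # so a query needing cpu_seconds of CPU takes at least this long in
        # wall-clock time (assuming perfectly fair scheduling, no other overhead).
        share = min(1.0, cores / concurrent_queries)
        return cpu_seconds / share

    # 1 second of CPU per query on 32 cores:
    # 32 concurrent -> 1.0 s, 64 -> 2.0 s, 500 -> ~15.6 s
    for n in (32, 64, 500):
        print(n, round(fair_share_latency(1.0, n, 32), 1))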