On 07/02/2020 12:49, Chris Ellis wrote:
Sure, but that's where I'm trying to find out what's sensible. The box has 196GB memory, most of that in hugepages, 18 core Intel Skylake with HT on giving 36 cores and tonnes of SSD for storage. How would I turn that spec into a sensible number for max_connections? As that number grows, what contention points in postgres will start creaking (shared memory where the IPC happens?)

What's "too much" for max_connections? What happens when you set it too high? What factors affect that number?
When sizing max_connections you need to trade off how many connections your application will use at peak against how much RAM and CPU you have.
Each connection can allocate work_mem (potentially more than once, since work_mem is a per-sort/per-hash allocation rather than a per-connection cap) and has its own process stack, etc.
As such, you don't want max_connections set so high that the connections could run your system out of RAM.
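As a very rough back-of-envelope sketch (every number below is an assumption to illustrate the arithmetic, not a recommendation), you can estimate the worst case like this:

    # Worst-case memory budget for max_connections. All figures are
    # illustrative assumptions; measure your own workload instead.
    total_ram_gb         = 196   # the box described above
    shared_buffers_gb    = 48    # assumed shared_buffers / hugepages allocation
    work_mem_mb          = 64    # assumed work_mem setting
    sorts_per_query      = 2     # assumed; a complex query can use work_mem several times
    per_conn_overhead_mb = 10    # assumed stack, catalog caches, etc.

    per_conn_worst_mb = work_mem_mb * sorts_per_query + per_conn_overhead_mb
    available_mb      = (total_ram_gb - shared_buffers_gb) * 1024

    print("worst case per connection: %d MB" % per_conn_worst_mb)
    print("connections before RAM runs out: %d" % (available_mb // per_conn_worst_mb))

You'd want max_connections comfortably below whatever that last number works out to for your real settings.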
The max_connections setting is an upper limit after which PostgreSQL will reject connections. You don't really want to hit that limit; rather, you want to keep the number of concurrent connections to a reasonable number (and have max_connections somewhere above that).
Each connection is a PostgreSQL backend process, so active connections compete for resources, and even idle connections take up some RAM. Creating a new connection means launching a new process (and doing some setup), so it's relatively expensive.
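If you want to see how many backends you actually have, and how many of them are just sitting idle, something along these lines will show the breakdown from pg_stat_activity (a sketch using psycopg2; the DSN is a placeholder):

    # Count connections by state; "idle" ones still hold a backend process.
    import psycopg2

    conn = psycopg2.connect("dbname=mydb user=me")   # hypothetical DSN, replace it
    cur = conn.cursor()
    cur.execute("""
        SELECT state, count(*)
        FROM pg_stat_activity
        WHERE pid <> pg_backend_pid()
        GROUP BY state
        ORDER BY count(*) DESC
    """)
    for state, n in cur.fetchall():
        print(state, n)
    cur.close()
    conn.close()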
Doing some sort of connection pooling is a good idea, especially
for web-apps that connect, do a few short queries and disconnect.
Django is probably doing a passable job at pooling already, so you
might want to see how many connections it's using under normal
load. Adding a dedicated pooler in between Django and PostgreSQL
would give you more flexibility and might be a good idea, but if
what Django is already doing is reasonable you may not need it.
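For reference, the knob on the Django side is CONN_MAX_AGE in the database settings; strictly speaking it keeps connections open across requests rather than pooling them, but that's often enough. A sketch (names and values here are just placeholders):

    # Relevant fragment of a Django settings.py; illustrative values only.
    DATABASES = {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": "mydb",        # hypothetical
            "USER": "myuser",      # hypothetical
            "HOST": "127.0.0.1",   # point this at PgBouncer if you add a pooler
            "PORT": "5432",
            "CONN_MAX_AGE": 60,    # seconds to keep an idle connection open for reuse
        }
    }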
What's a good number of active connections to aim for? It
probably depends on whether they tend to be CPU-bound or IO-bound,
but I've seen the rule of thumb of "around twice the number of CPU
cores" tossed around, and it's probably a decent place to start.
Then run it under normal load and see how it behaves: CPU usage,
RAM, IO, request latency and throughput.
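If you do end up putting a pool in front of the database, that rule of thumb translates into something like the following (a sketch using psycopg2's built-in pool; the DSN and the floor of 2 are assumptions, and the real size should come from measuring under load):

    # Size a connection pool from the "2 x cores" starting point.
    import os
    from psycopg2.pool import ThreadedConnectionPool

    cores = os.cpu_count() or 1
    max_active = 2 * cores                   # starting point only, tune from measurements

    pool = ThreadedConnectionPool(
        minconn=2,                           # assumed floor
        maxconn=max_active,
        dsn="dbname=mydb user=myuser",       # hypothetical DSN
    )

    conn = pool.getconn()
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT 1")
            print(cur.fetchone())
    finally:
        pool.putconn(conn)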
Cheers,
Steve