Thanks Joe, I will set these kernel parameters.
I would also like to highlight that the issue happened on the SECONDARY, even though the PRIMARY has less memory and fewer vCPUs than the SECONDARY. I am not sure whether anything is wrong in PgSQL itself.
PRIMARY: 48vCPUs & 48GB memory
SECONDARY: 64vCPUs & 64GB memory
I noticed a few things that do not look right:
1. The total number of DBs is 1860 (the DB environment serves a multi-tenant product with around 1100 tenants, which means about that many DBs are active).
Is there any guideline on the number of DBs per instance for optimal performance? I would assume NO (it should depend purely on the overall workload), but I am asking out of curiosity.
2. max_connections is set to 10000.
I tried to reduce it to 4000 but was unable to do so (I tried this after reducing max_connections on the PRIMARY to 4000). This is the error:
FATAL: hot standby is not possible because max_connections = 4000 is a lower setting than on the master server (its value was 10000)
Sorry for the clutter if I am mixing multiple topics in one mail.
Regards
Siraj
On Tue, Oct 15, 2024 at 12:39 AM Joe Conway <mail@xxxxxxxxxxxxx> wrote:
On 10/14/24 14:37, Siraj G wrote:
> This is from the OS log (/var/log/kern.log):
>
> oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/system-postgresql.slice/postgresql@12-main.service,task=postgres,pid=2334587,uid=114
> 494 Oct 14 09:58:10 gce-k12-prod-as1-erp-pg-secondary kernel:
> [6905020.514569] Out of memory: Killed process 2334587 (postgres) total-vm:26349584kB, anon-rss:3464kB, file-rss:0kB, shmem-rss:21813032kB, UID:114 pgtables:49024kB oom_score_adj:0
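
To make the sizes in that kernel line easier to read, here is a minimal Python sketch that pulls out every `name:<n>kB` field and converts it to GiB. The function names are hypothetical, and the log line is the one quoted above, joined back into a single line. It only restates the numbers the kernel reported: the killed backend's resident memory is almost entirely shmem-rss (roughly 20.8 GiB), while anon-rss is only a few MiB.

```python
import re

# Kernel "Out of memory" line from the log above, joined back into one line.
OOM_LINE = (
    "Out of memory: Killed process 2334587 (postgres) "
    "total-vm:26349584kB, anon-rss:3464kB, file-rss:0kB, "
    "shmem-rss:21813032kB, UID:114 pgtables:49024kB oom_score_adj:0"
)


def parse_oom_sizes(line):
    """Return {field: size in kB} for every 'name:<digits>kB' pair in the line."""
    return {name: int(kb) for name, kb in re.findall(r"([\w-]+):(\d+)kB", line)}


def kb_to_gib(kb):
    """Convert kB (as printed by the kernel, 1024-byte units) to GiB."""
    return kb / (1024 * 1024)


if __name__ == "__main__":
    for name, kb in parse_oom_sizes(OOM_LINE).items():
        print(f"{name:>10}: {kb_to_gib(kb):8.2f} GiB")
```

Fields without a kB suffix (UID, oom_score_adj) are skipped on purpose, since they are not sizes.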
1. Do you happen to have swap disabled? If so, don't do that.
2. Does the postgres cgroup have memory.limit (cgroup v1) or memory.max
(cgroup v2) set?
3. If #2 answer is no, have you followed the documented guidance here
(in particular vm.overcommit_memory=2):
https://www.postgresql.org/docs/12/kernel-resources.html#LINUX-MEMORY-OVERCOMMIT
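
The three checks above can be sketched as a small Python script, meant as a starting point rather than a definitive tool. Assumptions: the cgroup directory is taken from the task_memcg value in the kernel log and may differ on your host; the cgroup v1 file is spelled memory.limit_in_bytes on disk; the helper names are hypothetical.

```python
from pathlib import Path

# Assumed cgroup directory, copied from task_memcg in the kernel log; adjust for your host.
CGROUP = "system.slice/system-postgresql.slice/postgresql@12-main.service"
LIMIT_FILES = [
    f"/sys/fs/cgroup/{CGROUP}/memory.max",                    # cgroup v2 ("max" = no limit)
    f"/sys/fs/cgroup/memory/{CGROUP}/memory.limit_in_bytes",  # cgroup v1
]
OVERCOMMIT_MODES = {
    0: "heuristic (kernel default)",
    1: "always overcommit",
    2: "strict accounting",
}


def read_text(path):
    """Return the stripped file content, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def swap_total_kb(meminfo_text):
    """Return SwapTotal in kB from /proc/meminfo content, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            return int(line.split()[1])
    return None


def describe_overcommit(value):
    """Map a vm.overcommit_memory value to a short description."""
    return OVERCOMMIT_MODES.get(value, "unknown")


if __name__ == "__main__":
    # 1. Swap: a SwapTotal of 0 kB means no swap is configured.
    meminfo = read_text("/proc/meminfo")
    swap = swap_total_kb(meminfo) if meminfo else None
    print("1. SwapTotal (kB):", swap if swap is not None else "unavailable")

    # 2. cgroup memory limit: print whichever limit file exists.
    for path in LIMIT_FILES:
        limit = read_text(path)
        if limit is not None:
            print(f"2. {path} = {limit}")

    # 3. Overcommit policy: the linked documentation recommends mode 2.
    mode = read_text("/proc/sys/vm/overcommit_memory")
    if mode is not None and mode.isdigit():
        print(f"3. vm.overcommit_memory = {mode} ({describe_overcommit(int(mode))})")
```

The script only reads files under /proc and /sys, so it is safe to run on the standby as an unprivileged user; if a file is missing, that check is simply skipped.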
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com