Re: checkpoint and recovering process use too much memory

Justin Pryzby <pryzby@xxxxxxxxxxxxx> · Thu, 2 Nov 2017 21:21:28 -0500

On Fri, Nov 03, 2017 at 01:43:32AM +0000, tao tony wrote:
> I had an asynchronous streaming replication HA cluster. Each node had 64GB of memory. PG is 9.6.2, deployed on CentOS 6.
> 
> Last month the database was killed by the OS kernel for OOM; the checkpoint process was killed.

If you still have logs, was it killed during a large query?  Perhaps one using
a hash aggregate?

> I noticed the checkpoint process occupied more than 20GB of memory, and it was growing every day. On the hot-standby node, the recovering process occupied as much memory as the checkpoint process.

The "resident" RAM of a postgres subprocess is often just the fraction of
shared_buffers it has read or written.  The checkpointer must necessarily read
all dirty pages from s-b and write them out to disk (by way of the page cache),
so that's why its RSS grows toward the size of shared_buffers (32GB).  And the
recovery process is continuously writing into s-b.
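
If you want to check how much of a process's RSS is shared rather than private,
here is a rough sketch (my own illustration, not taken from your report).  It
reads /proc/<pid>/statm, which is where top gets RES and SHR on Linux, and
subtracts the shared part.  The function names parse_statm and process_memory
are made up for this example.

```python
import os


def parse_statm(statm_text, page_size):
    """Return (resident, shared, non_shared) in bytes.

    statm_text is the content of /proc/<pid>/statm; its second field is the
    number of resident pages and its third is the number of resident pages
    that are shared (file-backed or shared memory).
    """
    fields = statm_text.split()
    resident = int(fields[1]) * page_size
    shared = int(fields[2]) * page_size
    return resident, shared, resident - shared


def process_memory(pid):
    """Read /proc/<pid>/statm for a live process (Linux only)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open("/proc/%d/statm" % pid) as f:
        return parse_statm(f.read(), page_size)
```

For example, process_memory(167162) would report on the checkpointer from your
top output.  If the third number is small compared with the first, nearly all
of the RSS is shared_buffers, which every backend maps and which should be
counted only once, not once per process.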

> Now on the standby node, the checkpoint and recovering processes use more than 50GB of memory, as shown below, and I worry that someday the cluster will be killed by the OS again.
> 
>    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 167158 postgres  20   0 34.9g  25g  25g S  0.0 40.4  46:36.86 postgres: startup process   recovering 00000004000008550000004B
> 167162 postgres  20   0 34.9g  25g  25g S  0.0 40.2  17:58.38 postgres: checkpointer process
> 
> shared_buffers = 32GB

Also, what is work_mem set to?

Justin

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)