Re: New PG14 server won't start with >2GB shared_buffers

Tom,

Got a chance to work on this today.  Here is what I'm getting.  I ran the same command that is used when the config has a small memory setting.

$ /usr/lib/postgresql/14/bin/postgres -D /var/lib/postgresql/14/<dir> --config-file=/etc/postgresql/14/appdb/postgresql.conf --listen_addresses=0.0.0.0 --port=5432 --cluster_name=<name> --wal_level=logical --hot_standby=on --max_connections=150 --max_wal_senders=10 --max_prepared_transactions=0 --max_locks_per_transaction=64 --track_commit_timestamp=off --max_replication_slots=10 --max_worker_processes=12 --wal_log_hints=on
FATAL:  could not map anonymous shared memory: Cannot allocate memory
HINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory, swap space, or huge pages. To reduce the request size (currently 21967716352 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers or max_connections.
LOG:  database system is shut down

$ /usr/lib/postgresql/14/bin/postgres -D /var/lib/postgresql/14/<dir> --config-file=/etc/postgresql/14/appdb/postgresql.conf --listen_addresses=0.0.0.0 --port=5432 --cluster_name=<name> --wal_level=logical --hot_standby=on --max_connections=150 --max_wal_senders=10 --max_prepared_transactions=0 --max_locks_per_transaction=64 --track_commit_timestamp=off --max_replication_slots=10 --max_worker_processes=12 --wal_log_hints=on
FATAL:  could not create shared memory segment: Cannot allocate memory
DETAIL:  Failed system call was shmget(key=2359323, size=21967716352, 03600).
HINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded your kernel's SHMALL parameter.  You might need to reconfigure the kernel with larger SHMALL.
The PostgreSQL documentation contains more information about shared memory configuration.
LOG:  database system is shut down
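
For reference, both failures ask for the same 21967716352 bytes. With 4 KiB pages that is 5363212 pages, or about 20.46 GiB. The sketch below is not from the thread. It converts the request to pages (kernel.shmall is counted in pages, kernel.shmmax in bytes) and prints the kernel settings that are commonly compared against it. Which of them, if any, is the limiting factor on this host is an open question; the overcommit and huge page values are included only because the first HINT mentions available memory, swap space, and huge pages.

```shell
#!/bin/sh
# Request size copied from the FATAL/DETAIL messages above.
request_bytes=21967716352

# Page size of this machine (4096 on most x86_64 Linux systems).
page_size=$(getconf PAGE_SIZE)

# Round the request up to whole pages, since kernel.shmall is in pages.
request_pages=$(( (request_bytes + page_size - 1) / page_size ))
echo "request: $request_bytes bytes = $request_pages pages of $page_size bytes"

# Read-only look at the settings the HINT text refers to.
for f in /proc/sys/kernel/shmall /proc/sys/kernel/shmmax \
         /proc/sys/vm/overcommit_memory /proc/sys/vm/overcommit_ratio; do
  if [ -r "$f" ]; then
    echo "$f = $(cat "$f")"
  fi
done

# Memory, swap, commit accounting, and huge pages as the kernel sees them.
if [ -r /proc/meminfo ]; then
  grep -E '^(MemTotal|MemAvailable|SwapTotal|CommitLimit|Committed_AS|HugePages_Total|HugePages_Free|Hugepagesize)' /proc/meminfo || true
fi
```

If vm.overcommit_memory is 2, comparing the request against CommitLimit minus Committed_AS may be worth a look, since an anonymous mmap() can fail with "Cannot allocate memory" under strict overcommit even when free RAM appears sufficient.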

$ ulimit -v
unlimited

I'm not sure about the cgroup stuff.  I installed lscgroup and this is what it shows (BTW, we are using Patroni to control PostgreSQL, but it does not appear to be part of the problem, since I've duplicated the errors without it):
$ lscgroup
devices:/
devices:/user.slice
devices:/system.slice
devices:/system.slice/irqbalance.service
devices:/system.slice/system-systemd\x2dfsck.slice
devices:/system.slice/syslog-ng.service
devices:/system.slice/systemd-networkd.service
devices:/system.slice/systemd-udevd.service
devices:/system.slice/cron.service
devices:/system.slice/oddjobd.service
devices:/system.slice/sys-fs-fuse-connections.mount
devices:/system.slice/sys-kernel-config.mount
devices:/system.slice/networkd-dispatcher.service
devices:/system.slice/sys-kernel-debug.mount
devices:/system.slice/certmonger.service
devices:/system.slice/accounts-daemon.service
devices:/system.slice/swapfile.swap
devices:/system.slice/numad.service
devices:/system.slice/systemd-journald.service
devices:/system.slice/unattended-upgrades.service
devices:/system.slice/sensu-client.service
devices:/system.slice/ssh.service
devices:/system.slice/dev-mqueue.mount
devices:/system.slice/rpc-gssd.service
devices:/system.slice/vnstat.service
devices:/system.slice/var-lib-postgresql.mount
devices:/system.slice/rpcbind.service
devices:/system.slice/chrony.service
devices:/system.slice/sssd.service
devices:/system.slice/proc-sys-fs-binfmt_misc.mount
devices:/system.slice/run-rpc_pipefs.mount
devices:/system.slice/autofs.service
devices:/system.slice/patroni.service
devices:/system.slice/consul.service
devices:/system.slice/telegraf.service
devices:/system.slice/dev-hugepages.mount
devices:/system.slice/dbus.service
devices:/system.slice/system-getty.slice
devices:/system.slice/systemd-logind.service
cpuset:/
cpu,cpuacct:/
cpu,cpuacct:/user.slice
cpu,cpuacct:/user.slice/user-5088.slice
cpu,cpuacct:/system.slice
memory:/
pids:/
pids:/user.slice
pids:/user.slice/user-5088.slice
pids:/user.slice/user-5088.slice/user@5088.service
pids:/user.slice/user-5088.slice/session-267.scope
pids:/user.slice/user-5088.slice/session-269.scope
pids:/user.slice/user-5088.slice/session-88.scope
pids:/system.slice
pids:/system.slice/irqbalance.service
pids:/system.slice/system-systemd\x2dfsck.slice
pids:/system.slice/syslog-ng.service
pids:/system.slice/systemd-networkd.service
pids:/system.slice/systemd-udevd.service
pids:/system.slice/cron.service
pids:/system.slice/oddjobd.service
pids:/system.slice/sys-fs-fuse-connections.mount
pids:/system.slice/sys-kernel-config.mount
pids:/system.slice/networkd-dispatcher.service
pids:/system.slice/sys-kernel-debug.mount
pids:/system.slice/certmonger.service
pids:/system.slice/accounts-daemon.service
pids:/system.slice/swapfile.swap
pids:/system.slice/numad.service
pids:/system.slice/systemd-journald.service
pids:/system.slice/unattended-upgrades.service
pids:/system.slice/sensu-client.service
pids:/system.slice/ssh.service
pids:/system.slice/dev-mqueue.mount
pids:/system.slice/rpc-gssd.service
pids:/system.slice/vnstat.service
pids:/system.slice/var-lib-postgresql.mount
pids:/system.slice/rpcbind.service
pids:/system.slice/chrony.service
pids:/system.slice/sssd.service
pids:/system.slice/proc-sys-fs-binfmt_misc.mount
pids:/system.slice/run-rpc_pipefs.mount
pids:/system.slice/autofs.service
pids:/system.slice/patroni.service
pids:/system.slice/consul.service
pids:/system.slice/telegraf.service
pids:/system.slice/dev-hugepages.mount
pids:/system.slice/dbus.service
pids:/system.slice/system-getty.slice
pids:/system.slice/system-getty.slice/getty@tty1.service
pids:/system.slice/systemd-logind.service
rdma:/
hugetlb:/
perf_event:/
blkio:/
freezer:/
net_cls,net_prio:/
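
Note that lscgroup lists group names per controller, not the limits attached to them, and the listing above shows only the root group under memory:. To see whether a memory limit actually applies to a given process, one option is to read that process's entry in /proc/<pid>/cgroup and then the matching limit file. The following is a minimal sketch, not something from the thread. The helper name limit_file_for is made up for illustration, and the mount points /sys/fs/cgroup/memory (cgroup v1) and /sys/fs/cgroup (cgroup v2) are the conventional systemd locations, assumed here rather than confirmed for this host.

```shell
#!/bin/sh
# Hypothetical helper: map one line of /proc/<pid>/cgroup
# (format: hierarchy-ID:controller-list:path) to the file holding
# the memory limit. Prints nothing for non-memory hierarchies.
limit_file_for() {
  rest=${1#*:}              # drop the hierarchy ID
  controllers=${rest%%:*}   # e.g. "memory", "cpu,cpuacct", or "" on v2
  path=${rest#*:}           # cgroup path, e.g. "/system.slice/patroni.service"
  case ",$controllers," in
    *,memory,*) echo "/sys/fs/cgroup/memory${path}/memory.limit_in_bytes" ;;  # v1 (assumed mount point)
    ,,)         echo "/sys/fs/cgroup${path}/memory.max" ;;                    # v2 unified hierarchy
  esac
}

# PID to inspect: first argument, or this shell if none is given.
pid=${1:-$$}

if [ -r "/proc/$pid/cgroup" ]; then
  while read -r line; do
    f=$(limit_file_for "$line")
    if [ -n "$f" ] && [ -r "$f" ]; then
      echo "$f = $(cat "$f")"
    fi
  done < "/proc/$pid/cgroup"
fi
```

Run it as the postgres OS user with no argument to check the login shell, or pass the PID of the Patroni process to check the service side. A very large number (or "max" on v2) means no effective limit on that group.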


Thanks,


Chris Hoover
Senior DBA
AWeber.com
Cell: (803) 528-2269
Email: chrish@xxxxxxxxxx



On Feb 25, 2023, at 11:35 AM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:

> Evan Rempel <erempel@xxxxxxx> writes:
>> Bear in mind that if you are using systemd to start postgres, then these
>> user limits may not apply.
>
> Yeah, it seems likely that the PG server is being started under
> more-restrictive limits than what these manual reports suggest.
> It would be useful to try logging in as the Postgres OS user and
> manually starting the server -- just do "postgres &" and see what
> happens.  (If it does start, "pg_ctl stop" can be used to shut it
> down again, or you can manually send SIGTERM to the postmaster
> process.)
>
> I tried to reproduce the problem by intentionally setting
> "ulimit -v" too small for my PG settings, and I got error messages
> that were similar but not identical to what Chris reported.
> (I think the sysv case failed at shmat() not shmget().)  So it's
> probably not ulimit per se that's responsible.  But if the
> server is being started under systemd, then I can entirely
> believe that systemd has some poorly-documented feature that
> sets additional limits for daemon processes.
>
> I'm still wondering about cgroups, too.
>
> regards, tom lane
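
One way to compare the limits of an interactive shell with those of a systemd-started service is to read /proc/<pid>/limits for both and ask systemd which resource properties it has set on the unit. This is a rough sketch under assumptions: the unit name patroni.service is taken from the lscgroup listing earlier in the thread, show_limits is a made-up helper name, and the properties queried (LimitAS, LimitMEMLOCK, MemoryLimit, MemoryMax) are standard systemd resource settings that may or may not be relevant here.

```shell
#!/bin/sh
# Unit name assumed from the lscgroup output in this thread.
unit=patroni.service

# Hypothetical helper: print the memory-related rlimits of process $1.
show_limits() {
  grep -E '^Max (address space|locked memory|resident set)' "/proc/$1/limits"
}

echo "limits of this shell (PID $$):"
show_limits "$$" || echo "  (could not read /proc/$$/limits)"

if command -v systemctl >/dev/null 2>&1; then
  # What systemd itself has configured for the unit.
  systemctl show "$unit" -p LimitAS -p LimitMEMLOCK -p MemoryLimit -p MemoryMax || true

  # Effective limits of the unit's main process, if it is running.
  main_pid=$(systemctl show "$unit" -p MainPID 2>/dev/null | cut -d= -f2)
  if [ -n "$main_pid" ] && [ "$main_pid" != "0" ] && [ -r "/proc/$main_pid/limits" ]; then
    echo "limits of $unit (PID $main_pid):"
    show_limits "$main_pid"
  fi
fi
```

If the two sets of limits match and are unlimited, that would point away from rlimits and toward kernel memory accounting or cgroup settings instead.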

