In a heavily used production database prod, running a `psql prod` takes seconds. Once the connection is established, performance of queries is fine. There are ~2 new usually short-lived connections / second, and a couple of long-running analytical queries. Connecting to other databases on the same PostgreSQL server is fast.
Here is what I tried:
-bash-4.2$ date; psql prod -c "SELECT backend_start, xact_start, query_start FROM pg_stat_activity WHERE pid = (SELECT pg_backend_pid());"; date
Wed Jun 16 09:01:51 EDT 2021
backend_start | xact_start | query_start
-------------------------------+-------------------------------+-------------------------------
2021-06-16 09:01:51.596197-04 | 2021-06-16 09:01:57.979979-04 | 2021-06-16 09:01:57.979979-04
(1 row)
Wed Jun 16 09:01:57 EDT 2021
So the backend started quickly, but it took 6 seconds to start running the query.
I used
ps --sort=start_time -elfy | grep "postgres prod .*startup" | head -n1 | awk '{print $3}'
to identify the PID the most recent pg backend started by myself in startup mode.
I straced PG backend after running psql:
$ sudo strace -tt -T -v -p $(ps --sort=start_time -elfy | grep "postgres prod .*startup" | head -n1 | awk '{print $3}')
(...)
08:47:28.870917 read(6, "|\25\3\0\0|\v\335$!\1\0\364\0\20\1\0 \4 \374%\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.870951 read(6, "|\25\3\0@\274\202\262\254\230\1\0\364\0\20\1\0 \4 \0&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.870986 read(6, "|\25\3\0\220\f\270\266\24\t\1\0\364\0\20\1\0 \4 \5&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.871021 read(6, "|\25\3\0008%\200\262r%\1\0\364\0\20\1\0 \4 \4&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000011>
08:47:28.871056 semop(41779221, [{8, 1, 0}], 1) = 0 <0.000010>
08:47:28.871088 read(6, "|\25\3\0\220[\202\262\241\230\1\0\364\0\20\1\0 \4 \6&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.871124 read(6, "|\25\3\0\350\247\200\262\32^\1\0\364\0\20\1\0 \4 \1&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.871158 read(6, "|\25\3\0P\373\262\266D\262\1\0\364\0\20\1\0 \4 \373%\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.871192 read(6, "|\25\3\0\360\240\201\262\345\206\1\0\364\0\20\1\0 \4 \3&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.871227 read(6, "|\25\3\0p\372\262\266\0h\1\0\364\0\20\1\0 \4 \2&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.871262 read(6, "|\25\3\0@\370\262\266\313\267\1\0\364\0\20\1\0 \4 \7&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000011>
(...)
then stopped strace and ran:
$ sudo lsof -p $(ps --sort=start_time -elfy | grep "postgres prod .*startup" | head -n1 | awk '{print $3}')
(...)
postmaste 4722 postgres 6u REG 8,1 1073741824 369860117 /opt/prod/pg/9.6/base/18370/1108887031
(...)
It's always /opt/prod/pg/9.6/base/18370/1108887031. Unfortunately, no trace of it in pg_class:
prod=# SELECT relname FROM pg_class WHERE oid = 1108887031;
relname
---------
(0 rows)
prod=# SELECT relname FROM pg_class WHERE relfilenode = 1108887031;
relname
---------
(0 rows)
What does pg backend do on startup?
I would be thankful for suggestions on how to troubleshoot it.
Versions:
PostgreSQL 9.6.21 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
Linux 3.10.0-1062.el7.x86_64 #1 SMP Wed Aug 7 18:08:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
--
Thanks,
Marcin
Here is what I tried:
-bash-4.2$ date; psql prod -c "SELECT backend_start, xact_start, query_start FROM pg_stat_activity WHERE pid = (SELECT pg_backend_pid());"; date
Wed Jun 16 09:01:51 EDT 2021
backend_start | xact_start | query_start
-------------------------------+-------------------------------+-------------------------------
2021-06-16 09:01:51.596197-04 | 2021-06-16 09:01:57.979979-04 | 2021-06-16 09:01:57.979979-04
(1 row)
Wed Jun 16 09:01:57 EDT 2021
So the backend started quickly, but it took 6 seconds to start running the query.
I used
ps --sort=start_time -elfy | grep "postgres prod .*startup" | head -n1 | awk '{print $3}'
to identify the PID the most recent pg backend started by myself in startup mode.
I straced PG backend after running psql:
$ sudo strace -tt -T -v -p $(ps --sort=start_time -elfy | grep "postgres prod .*startup" | head -n1 | awk '{print $3}')
(...)
08:47:28.870917 read(6, "|\25\3\0\0|\v\335$!\1\0\364\0\20\1\0 \4 \374%\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.870951 read(6, "|\25\3\0@\274\202\262\254\230\1\0\364\0\20\1\0 \4 \0&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.870986 read(6, "|\25\3\0\220\f\270\266\24\t\1\0\364\0\20\1\0 \4 \5&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.871021 read(6, "|\25\3\0008%\200\262r%\1\0\364\0\20\1\0 \4 \4&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000011>
08:47:28.871056 semop(41779221, [{8, 1, 0}], 1) = 0 <0.000010>
08:47:28.871088 read(6, "|\25\3\0\220[\202\262\241\230\1\0\364\0\20\1\0 \4 \6&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.871124 read(6, "|\25\3\0\350\247\200\262\32^\1\0\364\0\20\1\0 \4 \1&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.871158 read(6, "|\25\3\0P\373\262\266D\262\1\0\364\0\20\1\0 \4 \373%\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.871192 read(6, "|\25\3\0\360\240\201\262\345\206\1\0\364\0\20\1\0 \4 \3&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.871227 read(6, "|\25\3\0p\372\262\266\0h\1\0\364\0\20\1\0 \4 \2&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000010>
08:47:28.871262 read(6, "|\25\3\0@\370\262\266\313\267\1\0\364\0\20\1\0 \4 \7&\4\6p\237\30\1\340\236\30\1"..., 8192) = 8192 <0.000011>
(...)
then stopped strace and ran:
$ sudo lsof -p $(ps --sort=start_time -elfy | grep "postgres prod .*startup" | head -n1 | awk '{print $3}')
(...)
postmaste 4722 postgres 6u REG 8,1 1073741824 369860117 /opt/prod/pg/9.6/base/18370/1108887031
(...)
It's always /opt/prod/pg/9.6/base/18370/1108887031. Unfortunately, no trace of it in pg_class:
prod=# SELECT relname FROM pg_class WHERE oid = 1108887031;
relname
---------
(0 rows)
prod=# SELECT relname FROM pg_class WHERE relfilenode = 1108887031;
relname
---------
(0 rows)
What does pg backend do on startup?
I would be thankful for suggestions on how to troubleshoot it.
Versions:
PostgreSQL 9.6.21 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
Linux 3.10.0-1062.el7.x86_64 #1 SMP Wed Aug 7 18:08:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
--
Thanks,
Marcin