Search Postgresql Archives

Re: postgresql 9.4 restart

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/8/19 4:35 PM, Julie Nishimura wrote:
Thanks again for your help.

1) max_connections is set to 500 in our config file
2) our average conn load is 300-400, however, I am not sure if it was close to max_conn at the moment of restart
3) we do not have replication on this staging server
4) These are the snippets from auth logs with the corresponding time:

I do not see anything that matches.

Do you know what time zone the log entries are in?
cat /etc/timezone might help.


/var/message/syslog:

May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10290]: (postgres) CMD (/usr/bin/find /data/postgresql/9.4/log -type f ! -name '*.bz2' -mmin +480 -exec bzip2 {} \;) May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10323]: (postgres) CMD (/usr/bin/find /data/postgresql/9.4/log -type f -empty -mmin +60 -delete) May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10379]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1) May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10655]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1) May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10705]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly) May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10902]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1) May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11401]: (root) LIST (root) May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11403]: (root) LIST (postgres) May  7 00:28:24 hitw-esg-portal-stage-db-1 puppet-agent[10914]: (/Stage[main]/Ci::Db/Package[postgresql-plpython-]/ensure) ensure changed 'purged' to 'present' May  7 00:28:40 hitw-esg-portal-stage-db-1 puppet-agent[10914]: Finished catalog run in 106.68 seconds May  7 00:35:01 hitw-esg-portal-stage-db-1 CRON[14398]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)

this is from auth.log:

May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]: pam_unix(cron:session): session opened for user postgres by (uid=0) May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]: pam_unix(cron:session): session closed for user postgres May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]: pam_unix(cron:session): session opened for user postgres by (uid=0) May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]: pam_unix(cron:session): session closed for user postgres May  7 00:02:29 hitw-esg-portal-stage-db-1 sshd[10306]: Connection closed by ::1 port 49936 [preauth] May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]: pam_unix(cron:session): session opened for user postgres by (uid=0) May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]: pam_unix(cron:session): session closed for user postgres May  7 00:04:30 hitw-esg-portal-stage-db-1 sshd[10365]: Connection closed by ::1 port 49942 [preauth] May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]: pam_unix(cron:session): session opened for user root by (uid=0) May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]: pam_unix(cron:session): session closed for user root May  7 00:06:31 hitw-esg-portal-stage-db-1 sshd[10422]: Connection closed by ::1 port 49948 [preauth] May  7 00:08:33 hitw-esg-portal-stage-db-1 sshd[10483]: Connection closed by ::1 port 49954 [preauth] May  7 00:10:34 hitw-esg-portal-stage-db-1 sshd[10540]: Connection closed by ::1 port 49960 [preauth] May  7 00:12:35 hitw-esg-portal-stage-db-1 sshd[10594]: Connection closed by ::1 port 49966 [preauth] May  7 00:14:36 hitw-esg-portal-stage-db-1 sshd[10642]: Connection closed by ::1 port 49972 [preauth] May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]: pam_unix(cron:session): session opened for user root by (uid=0) May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]: pam_unix(cron:session): session closed for user root May  7 00:16:37 hitw-esg-portal-stage-db-1 sshd[10691]: Connection closed by ::1 port 49978 [preauth] May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]: pam_unix(cron:session): session opened for user root by (uid=0) May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]: pam_unix(cron:session): session closed for user root May  7 00:18:39 hitw-esg-portal-stage-db-1 sshd[10741]: Connection closed by ::1 port 49984 [preauth] May  7 00:20:40 hitw-esg-portal-stage-db-1 sshd[10787]: Connection closed by ::1 port 49990 [preauth] May  7 00:22:41 hitw-esg-portal-stage-db-1 sshd[10833]: Connection closed by ::1 port 49996 [preauth] May  7 00:24:42 hitw-esg-portal-stage-db-1 sshd[10890]: Connection closed by ::1 port 50002 [preauth] May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]: pam_unix(cron:session): session opened for user root by (uid=0) May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]: pam_unix(cron:session): session closed for user root May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P format=unaligned -c SELECT datname, pg_encoding_to_char(encoding) FROM pg_database WHERE datistemplate = false AND datname != 'postgres' May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session opened for user postgres by (uid=0) May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session closed for user postgres May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P format=unaligned -c SHOW server_version May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session opened for user postgres by (uid=0) May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session): session closed for user postgres May  7 00:26:44 hitw-esg-portal-stage-db-1 sshd[11327]: Connection closed by ::1 port 50018 [preauth]


For me, it seems like there is some cron job ran shortly after midnight (either from root or through puppet and restarted the server, because maybe it was close to max connections - just a guess). What do you think?

-Julie N


------------------------------------------------------------------------
*From:* Adrian Klaver <adrian.klaver@xxxxxxxxxxx>
*Sent:* Wednesday, May 8, 2019 2:17 PM
*To:* Julie Nishimura
*Cc:* pgsql-general@xxxxxxxxxxxxxxxxxxxx; pgsql-general
*Subject:* Re: postgresql 9.4 restart
On 5/8/19 2:01 PM, Julie Nishimura wrote:
Adrian, thanks for your reply!



PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1), compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit

This is what Ive found in logs prior to the restart:

UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07
00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are reserved for non-replication superuser connections",,,,,,,,,"" 2019-05-07 00:08:26.778

The above is not good.

What do you have set for max_connections?

How many connections are you trying against the database?

Do you have replication set up to other Postgres instances(uk, au, us)?

What does the system log show from the same time period?



UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07
00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021 user=hitwise database=hitwise_us_stg host=10.200.193.58 port=49424",,,,,,,,,"" 2019-05-07 00:08:26.779 UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018 user=hitwise database=hitwise_us_stg host=10.200.193.58 port=49426",,,,,,,,,"" 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17 UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,"" 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17 UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,"" 2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07 00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,"" 2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07 00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,"" 2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07 00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,"" 2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07 00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,"" 2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07 00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,"" 2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07 00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to administrator command",,,,,,,,,"" 2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07 00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339 user=hitwise database=hitwise_uk_stg host=10.200.193.58 port=49246",,,,,,,,,"" 2019-05-07 00:08:28.060 UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07 00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215 user=hitwise database=hitwise_au_stg host=10.200.250.49 port=31100",,,,,,,,,""

Does it help?

------------------------------------------------------------------------
*From:* Adrian Klaver <adrian.klaver@xxxxxxxxxxx>
*Sent:* Wednesday, May 8, 2019 12:07 PM
*To:* Julie Nishimura
*Cc:* pgsql-general@xxxxxxxxxxxxxxxxxxxx; pgsql-general
*Subject:* Re: postgresql 9.4 restart
On 5/8/19 11:00 AM, Julie Nishimura wrote:
Hello,
Our staging 9.4 postgresql has been running couple of weeks with no problem, but yesterday we saw this error from one of our services that connects to the staging DBs:

FATAL: terminating connection due to
administrator command; nested exception is
org.postgresql.util.PSQLException: FATAL: terminating connection due to
administrator command”,

Distro and version?

Well it originated in Java code if that helps?

Anything in the Postgres logs just prior to 'FATAL: terminating
connection ...' that might be relevant?

Do the system logs showing anything relevant e.g. the OOM killer
stopping Postgres?




Then I could see that postgresql got restarted, based on the output of "select pg_postmaster_start_time()"

We are running postgresql on
Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

How I can find out what cause the restart? Any ideas are welcome. Thanks!

-Julie N




--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx


--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx


--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux