Search Postgresql Archives

Re: postgresql 9.4 restart

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Etc/UTC


From: Adrian Klaver <adrian.klaver@xxxxxxxxxxx>
Sent: Thursday, May 9, 2019 7:51 AM
To: Julie Nishimura
Cc: pgsql-general@xxxxxxxxxxxxxxxxxxxx; pgsql-general
Subject: Re: postgresql 9.4 restart
 
On 5/8/19 4:35 PM, Julie Nishimura wrote:
> Thanks again for your help.
>
> 1) max_connections is set to 500 in our config file
> 2) our average conn load is 300-400, however, I am not sure if it was
> close to max_conn at the moment of restart
> 3) we do not have replication on this staging server
> 4) These are the snippets from auth logs with the corresponding time:

I do not see anything that matches.

Do you know what time zone the log entries are in?
cat /etc/timezone might help.

>
> /var/message/syslog:
>
> May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10290]: (postgres) CMD
> (/usr/bin/find /data/postgresql/9.4/log -type f ! -name '*.bz2' -mmin
> +480 -exec bzip2 {} \;)
> May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10323]: (postgres) CMD
> (/usr/bin/find /data/postgresql/9.4/log -type f -empty -mmin +60 -delete)
> May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10379]: (root) CMD
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10655]: (root) CMD
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10705]: (root) CMD (  
> cd / && run-parts --report /etc/cron.hourly)
> May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10902]: (root) CMD
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
> May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11401]: (root) LIST
> (root)
> May  7 00:27:32 hitw-esg-portal-stage-db-1 crontab[11403]: (root) LIST
> (postgres)
> May  7 00:28:24 hitw-esg-portal-stage-db-1 puppet-agent[10914]:
> (/Stage[main]/Ci::Db/Package[postgresql-plpython-]/ensure) ensure
> changed 'purged' to 'present'
> May  7 00:28:40 hitw-esg-portal-stage-db-1 puppet-agent[10914]: Finished
> catalog run in 106.68 seconds
> May  7 00:35:01 hitw-esg-portal-stage-db-1 CRON[14398]: (root) CMD
> (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
>
> this is from auth.log:
>
> May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]:
> pam_unix(cron:session): session opened for user postgres by (uid=0)
> May  7 00:01:01 hitw-esg-portal-stage-db-1 CRON[10265]:
> pam_unix(cron:session): session closed for user postgres
> May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]:
> pam_unix(cron:session): session opened for user postgres by (uid=0)
> May  7 00:02:01 hitw-esg-portal-stage-db-1 CRON[10289]:
> pam_unix(cron:session): session closed for user postgres
> May  7 00:02:29 hitw-esg-portal-stage-db-1 sshd[10306]: Connection
> closed by ::1 port 49936 [preauth]
> May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]:
> pam_unix(cron:session): session opened for user postgres by (uid=0)
> May  7 00:03:01 hitw-esg-portal-stage-db-1 CRON[10322]:
> pam_unix(cron:session): session closed for user postgres
> May  7 00:04:30 hitw-esg-portal-stage-db-1 sshd[10365]: Connection
> closed by ::1 port 49942 [preauth]
> May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]:
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:05:01 hitw-esg-portal-stage-db-1 CRON[10378]:
> pam_unix(cron:session): session closed for user root
> May  7 00:06:31 hitw-esg-portal-stage-db-1 sshd[10422]: Connection
> closed by ::1 port 49948 [preauth]
> May  7 00:08:33 hitw-esg-portal-stage-db-1 sshd[10483]: Connection
> closed by ::1 port 49954 [preauth]
> May  7 00:10:34 hitw-esg-portal-stage-db-1 sshd[10540]: Connection
> closed by ::1 port 49960 [preauth]
> May  7 00:12:35 hitw-esg-portal-stage-db-1 sshd[10594]: Connection
> closed by ::1 port 49966 [preauth]
> May  7 00:14:36 hitw-esg-portal-stage-db-1 sshd[10642]: Connection
> closed by ::1 port 49972 [preauth]
> May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]:
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:15:01 hitw-esg-portal-stage-db-1 CRON[10654]:
> pam_unix(cron:session): session closed for user root
> May  7 00:16:37 hitw-esg-portal-stage-db-1 sshd[10691]: Connection
> closed by ::1 port 49978 [preauth]
> May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]:
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:17:01 hitw-esg-portal-stage-db-1 CRON[10704]:
> pam_unix(cron:session): session closed for user root
> May  7 00:18:39 hitw-esg-portal-stage-db-1 sshd[10741]: Connection
> closed by ::1 port 49984 [preauth]
> May  7 00:20:40 hitw-esg-portal-stage-db-1 sshd[10787]: Connection
> closed by ::1 port 49990 [preauth]
> May  7 00:22:41 hitw-esg-portal-stage-db-1 sshd[10833]: Connection
> closed by ::1 port 49996 [preauth]
> May  7 00:24:42 hitw-esg-portal-stage-db-1 sshd[10890]: Connection
> closed by ::1 port 50002 [preauth]
> May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]:
> pam_unix(cron:session): session opened for user root by (uid=0)
> May  7 00:25:01 hitw-esg-portal-stage-db-1 CRON[10901]:
> pam_unix(cron:session): session closed for user root
> May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown
> ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P
> format=unaligned -c SELECT datname, pg_encoding_to_char(encoding) FROM
> pg_database WHERE datistemplate = false AND datname != 'postgres'
> May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
> session opened for user postgres by (uid=0)
> May  7 00:25:44 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
> session closed for user postgres
> May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo:     root : TTY=unknown
> ; PWD=/ ; USER=postgres ; COMMAND=/usr/bin/psql -q -t -P
> format=unaligned -c SHOW server_version
> May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
> session opened for user postgres by (uid=0)
> May  7 00:25:51 hitw-esg-portal-stage-db-1 sudo: pam_unix(sudo:session):
> session closed for user postgres
> May  7 00:26:44 hitw-esg-portal-stage-db-1 sshd[11327]: Connection
> closed by ::1 port 50018 [preauth]
>
>
> For me, it seems like there is some cron job ran shortly after midnight
> (either from root or through puppet and restarted the server, because
> maybe it was close to max connections - just a guess). What do you think?
>
> -Julie N
>
>
> ------------------------------------------------------------------------
> *From:* Adrian Klaver <adrian.klaver@xxxxxxxxxxx>
> *Sent:* Wednesday, May 8, 2019 2:17 PM
> *To:* Julie Nishimura
> *Cc:* pgsql-general@xxxxxxxxxxxxxxxxxxxx; pgsql-general
> *Subject:* Re: postgresql 9.4 restart
> On 5/8/19 2:01 PM, Julie Nishimura wrote:
>> Adrian, thanks for your reply!
>>
>>
>>
>> PostgreSQL 9.4.21 on x86_64-pc-linux-gnu (Ubuntu 9.4.21-1.pgdg16.04+1),
>> compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609, 64-bit
>>
>> This is what Ive found in logs prior to the restart:
>
>> UTC,"hitwise","hitwise_us_stg",8677,"10.200.193.58:49428",5cd0cc7a.21e5,3,"startup",2019-05-07
>> 00:08:26 UTC,448/970,0,FATAL,53300,"remaining connection slots are
>> reserved for non-replication superuser connections",,,,,,,,,""
>> 2019-05-07 00:08:26.778
>
> The above is not good.
>
> What do you have set for max_connections?
>
> How many connections are you trying against the database?
>
> Do you have replication set up to other Postgres instances(uk, au, us)?
>
> What does the system log show from the same time period?
>
>
>
>> UTC,"hitwise","hitwise_us_stg",8675,"10.200.193.58:49424",5cd0cc7a.21e3,6,"idle",2019-05-07
>> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.021
>> user=hitwise database=hitwise_us_stg host=10.200.193.58
>> port=49424",,,,,,,,,""
>> 2019-05-07 00:08:26.779
>> UTC,"hitwise","hitwise_us_stg",8676,"10.200.193.58:49426",5cd0cc7a.21e4,6,"idle",2019-05-07
>> 00:08:26 UTC,,0,LOG,00000,"disconnection: session time: 0:00:00.018
>> user=hitwise database=hitwise_us_stg host=10.200.193.58
>> port=49426",,,,,,,,,""
>> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,5,,2019-04-24 00:50:17
>> UTC,,0,LOG,00000,"received fast shutdown request",,,,,,,,,""
>> 2019-05-07 00:08:28.060 UTC,,,4117,,5cbfb2c9.1015,6,,2019-04-24 00:50:17
>> UTC,,0,LOG,00000,"aborting any active transactions",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,6,"idle",2019-05-07
>> 00:06:56 UTC,54/0,0,FATAL,57P01,"terminating connection due to
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_uk_stg",8559,"10.200.193.58:49248",5cd0cc20.216f,6,"idle",2019-05-07
>> 00:06:56 UTC,56/0,0,FATAL,57P01,"terminating connection due to
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_uk_stg",8561,"10.200.193.58:49252",5cd0cc20.2171,6,"idle",2019-05-07
>> 00:06:56 UTC,429/0,0,FATAL,57P01,"terminating connection due to
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,6,"idle",2019-05-07
>> 00:07:15 UTC,19/0,0,FATAL,57P01,"terminating connection due to
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_us_stg",8626,"10.200.250.49:31332",5cd0cc61.21b2,6,"idle",2019-05-07
>> 00:08:01 UTC,3/0,0,FATAL,57P01,"terminating connection due to
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_uk_stg",8560,"10.200.193.58:49250",5cd0cc20.2170,6,"idle",2019-05-07
>> 00:06:56 UTC,227/0,0,FATAL,57P01,"terminating connection due to
>> administrator command",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_uk_stg",8558,"10.200.193.58:49246",5cd0cc20.216e,7,"idle",2019-05-07
>> 00:06:56 UTC,,0,LOG,00000,"disconnection: session time: 0:01:31.339
>> user=hitwise database=hitwise_uk_stg host=10.200.193.58
>> port=49246",,,,,,,,,""
>> 2019-05-07 00:08:28.060
>> UTC,"hitwise","hitwise_au_stg",8578,"10.200.250.49:31100",5cd0cc33.2182,7,"idle",2019-05-07
>> 00:07:15 UTC,,0,LOG,00000,"disconnection: session time: 0:01:12.215
>> user=hitwise database=hitwise_au_stg host=10.200.250.49
>> port=31100",,,,,,,,,""
>>
>> Does it help?
>>
>> ------------------------------------------------------------------------
>> *From:* Adrian Klaver <adrian.klaver@xxxxxxxxxxx>
>> *Sent:* Wednesday, May 8, 2019 12:07 PM
>> *To:* Julie Nishimura
>> *Cc:* pgsql-general@xxxxxxxxxxxxxxxxxxxx; pgsql-general
>> *Subject:* Re: postgresql 9.4 restart
>> On 5/8/19 11:00 AM, Julie Nishimura wrote:
>>> Hello,
>>> Our staging 9.4 postgresql has been running couple of weeks with no
>>> problem, but yesterday we saw this error from one of our services that
>>> connects to the staging DBs:
>>>
>>> FATAL: terminating connection due to
>>> administrator command; nested exception is
>>> org.postgresql.util.PSQLException: FATAL: terminating connection due to
>>> administrator command”,
>>
>> Distro and version?
>>
>> Well it originated in Java code if that helps?
>>
>> Anything in the Postgres logs just prior to 'FATAL: terminating
>> connection ...' that might be relevant?
>>
>> Do the system logs showing anything relevant e.g. the OOM killer
>> stopping Postgres?
>>
>>
>>
>>>
>>> Then I could see that postgresql got restarted, based on the output of
>>> "select pg_postmaster_start_time()"
>>>
>>> We are running postgresql on
>>> Linux xxx-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64
>>> x86_64 x86_64 GNU/Linux
>>>
>>> How I can find out what cause the restart? Any ideas are welcome. Thanks!
>>>
>>> -Julie N
>>>
>>>
>>
>>
>> --
>> Adrian Klaver
>> adrian.klaver@xxxxxxxxxxx
>
>
> --
> Adrian Klaver
> adrian.klaver@xxxxxxxxxxx


--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux