Re: Segfault on postgresql 12.3

Julien Rouhaud <rjuju123@xxxxxxxxx> · Fri, 21 Aug 2020 14:34:14 +0200

On Fri, Aug 21, 2020 at 2:25 PM Thomas SIMON <tsimon@xxxxxxxxxxx> wrote:
>
> Hi all,
>
> I just had strange behavior on my postgresql instance, with postgresql
> auto restart
>
> Looking for logs, I've found a segfault in kern.log
>
> [12:24:09]root@db12:~# cat /var/log/kern.log
> 2020-08-21T12:00:01.436378+02:00 db12 kernel: postgres[177990]: segfault
> at 0 ip 00005636d2d844f1 sp 00007fff4fa69910 error 4 in
> postgres[5636d2cb7000+775000]
>
> I've also enabled core dump, file output is :
>
> [12:24:13]root@db12:~# file /data/postgresql/12/main/core
> /data/postgresql/12/main/core: ELF 64-bit LSB core file x86-64, version
> 1 (SYSV), SVR4-style, from 'postgres: 12/main: supervision neteven2
> localhost(34868) SELECT', real uid: 110, effective uid: 110, real gid:
> 114, effective gid: 114, execfn: '/usr/lib/postgresql/12/bin/postgres',
> platform: 'x86_64'
>
>
> In logs , I have these messages
>
> 2020-08-21 12:00:01.451 CEST [274137]: [299-1] user=,db=,app=,client=
> LOG:  server process (PID 177990) was terminated by signal 11:
> Segmentation fault
> 2020-08-21 12:00:01.451 CEST [274137]: [300-1] user=,db=,app=,client=
> DETAIL:  Failed process was running: SELECT usename,count(*) FROM
> pg_stat_activity WHERE pid != pg_backend_pid() GROUP BY usename ORDER BY 1
>
> ..
> 2020-08-21 12:00:02.776 CEST [274137]: [302-1] user=,db=,app=,client=
> LOG:  archiver process (PID 274215) exited with exit code 1
> 2020-08-21 12:00:02.774 CEST [274214]: [1-1] user=,db=,app=,client=
> WARNING:  terminating connection because of crash of another server process
> 2020-08-21 12:00:02.774 CEST [274214]: [2-1] user=,db=,app=,client=
> DETAIL:  The postmaster has commanded this server process to roll back
> the current transaction and exit, because another server process
> exited abnormally and possibly corrupted shared memory.
> 2020-08-21 12:00:02.774 CEST [274214]: [3-1] user=,db=,app=,client=
> HINT:  In a moment you should be able to reconnect to the database and
> repeat your command.
> (many times until full restart)
>
>
> I'm on 12.3 version, on a dedicated host on prem.

Note that version 12.4 is now available, although I don't see any relevant fix in it.

> root@db12:~# dpkg -l | grep postgresql
> ii  pgdg-keyring 2018.2                                 all
> keyring for apt.postgresql.org
> ii  postgresql-12 12.3-1.pgdg90+1                        amd64
> object-relational SQL database, version 12 server
> ii  postgresql-12-repmgr 5.1.0-1.stretch+1
> amd64        replication manager for PostgreSQL 12
> ii  postgresql-client-12 12.3-1.pgdg90+1
> amd64        front-end programs for PostgreSQL 12
> ii  postgresql-client-common 215.pgdg90+1
> all          manager for multiple PostgreSQL client versions
> ii  postgresql-common 215.pgdg90+1
> all          PostgreSQL database-cluster manager
> ii  postgresql-server-dev-12 12.3-1.pgdg90+1
> amd64        development files for PostgreSQL 12 server-side programming
>
>
> Could you please help me to find what is the root cause ? for

This is unfortunately not enough information to find the root issue.

Do you have any custom extensions installed?  Is there any chance you can get a
backtrace from the generated core dump? See
https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Getting_a_trace_from_a_randomly_crashing_backend
for more details on how to do that.
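
As a rough sketch of what that could look like with the paths and addresses from your report: the snippet below first computes the offset of the faulting instruction inside the postgres mapping (ip minus the base address from the kern.log line), then asks gdb for a backtrace of the core file. The variable names are only illustrative, and I am assuming that gdb and the debug symbols for postgresql-12 are installed (the exact symbol package name depends on your repository). Without symbols the backtrace will mostly show raw addresses.

```shell
#!/bin/sh
# Paths and addresses copied from the report above; variable names are illustrative.
bin=/usr/lib/postgresql/12/bin/postgres
core=/data/postgresql/12/main/core
ip=0x00005636d2d844f1     # "ip" field of the kern.log segfault line
base=0x5636d2cb7000       # base address in "postgres[5636d2cb7000+775000]"

# Offset of the faulting instruction relative to the start of the mapping.
offset=$(printf '0x%x' $(( $ip - $base )))
echo "faulting offset: $offset"    # prints "faulting offset: 0xcd4f1"

# Only attempt the backtrace where the core file and gdb are actually present.
# Assumption: debug symbols for the server binary are installed.
if [ -r "$core" ] && command -v gdb >/dev/null 2>&1; then
    gdb -q -batch -ex 'bt full' "$bin" "$core"
fi
```

The offset is only a secondary hint (it can be compared against the output of the backtrace); the gdb output is what would be useful to post here.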