Re: pg_dump crashes

On 5/22/20 6:40 AM, Nico De Ranter wrote:
I was just trying that. It's always the same (huge) table that crashes pg_dump. A dump that excludes that one table runs fine; a dump of only that one table crashes.
In the system logs I always see a segfault:

May 22 15:22:14 core4 kernel: [337837.874618] postgres[1311]: segfault at 7f778008ed0d ip 000055f197ccc008 sp 00007ffdd1fc15a8 error 4 in postgres[55f1977c0000+727000]
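
The kernel line above carries the faulting instruction pointer (`ip`) and, in brackets, the start address and size of the postgres mapping it fell in. On x86, `error 4` indicates a user-mode read of a page that is not mapped. A minimal sketch of turning the line into an offset inside the binary follows. It assumes only the segfault format shown above, and that the mapping starts at the binary's load base. The name `segfault_offset` is made up for illustration. With matching debug symbols installed, an offset like this is the kind of value one would pass to a tool such as addr2line.

```python
import re

# Kernel segfault report: "... ip <hex> sp <hex> error N in name[<base>+<size>]"
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>\d+) in (?P<name>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def segfault_offset(line):
    """Return (binary name, offset of the faulting instruction within its mapping)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a recognised segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    offset = ip - base
    # Sanity check: the instruction pointer must lie inside the reported mapping.
    if not 0 <= offset < size:
        raise ValueError("instruction pointer is outside the reported mapping")
    return m["name"], offset

# The line from the system log quoted above.
line = (
    "May 22 15:22:14 core4 kernel: [337837.874618] postgres[1311]: "
    "segfault at 7f778008ed0d ip 000055f197ccc008 sp 00007ffdd1fc15a8 "
    "error 4 in postgres[55f1977c0000+727000]"
)
name, offset = segfault_offset(line)
print(name, hex(offset))
```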

It doesn't seem to be an out-of-memory problem (at least not at the OS level).
The database is currently installed on a dedicated server with 32GB RAM. I tried tweaking some of the memory parameters for postgres, but the crash always happens at the exact same spot: if I run pg_dump for that one table with and without the memory tweaks, the resulting files are identical.
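
One way to confirm that two dump attempts stop at the same point is to compare the output files byte by byte. The following is a minimal sketch. The function name `first_difference` and the file names are hypothetical, and the small temporary files only stand in for real dumps.

```python
import os
import tempfile

def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the offset of the first differing byte, or None if the files are identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the exact byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No differing byte found: one file is a prefix of the other.
                return offset + min(len(a), len(b))
            if not a:
                return None  # both files ended without a difference
            offset += len(a)

# Stand-in files for the demo; real dumps would simply be passed by path.
with tempfile.TemporaryDirectory() as tmp:
    run1 = os.path.join(tmp, "run1.sql")
    run2 = os.path.join(tmp, "run2.sql")
    shorter = os.path.join(tmp, "shorter.sql")
    contents = {
        run1: b"COPY public.file\n1\t2\n",
        run2: b"COPY public.file\n1\t2\n",
        shorter: b"COPY public.file\n",
    }
    for path, data in contents.items():
        with open(path, "wb") as fh:
            fh.write(data)
    identical = first_difference(run1, run2)
    truncated_at = first_difference(run1, shorter)

print(identical, truncated_at)
```

A result of `None` for two real dump files would mean both runs produced exactly the same bytes before the crash.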

One thing I just noticed while looking at the dump file: near the end of the file I see this:

So is the output below from this command?

pg_dumpall --cluster 11/main --file=dump.sql


2087983804 516130 37989 2218636 3079067 0 0 P4B BcISC IGk L BOT BOP A jC BAA I BeMj/b BceUl6 BehUAn 0Ms A C I4p9CBfUiSeAPU4eDuipKQ *4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1145127487 1413694803 21071 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1145127487 1413694803 21071 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 
6071772946555290175 1056985679 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ?????????????????????????????? 4557430888798830399 1061109567 1061109567 1061109567 1061109567 16191 \N \N ??????????????????????????????* 2087983833 554418 37989 5405605 14507502 0 0 P4B Bb8c/ IGk L BOS BOP A Lfh BAA Bg BeMj+2 Bd1LVN BehUAl rlx ABA TOR
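
The repeated numbers in these rows are not arbitrary. Read as little-endian bytes, most of them are runs of 0x3F, the ASCII `?`, and a few spell out the text `?BADSECTOR`. A minimal sketch of that decoding follows. It assumes little-endian byte order (as on x86) and field widths of 8, 4 and 2 bytes guessed from the magnitudes of the values, not taken from the Bacula schema. The helper name `as_ascii` is made up for illustration.

```python
def as_ascii(value, width):
    """Render an unsigned integer as `width` little-endian bytes, shown as ASCII.

    Non-printable bytes are shown as '.'.
    """
    raw = value.to_bytes(width, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

# (value, assumed width in bytes), copied from the dump rows above.
samples = [
    (4557430888798830399, 8),  # first column of most of the suspicious rows
    (1061109567, 4),           # repeated in the next four columns
    (16191, 2),                # sixth column
    (1145127487, 4),           # fourth column of the rows that differ
    (1413694803, 4),           # fifth column of those rows
    (21071, 2),                # sixth column of those rows
    (6071772946555290175, 8),  # first column of the row starting the second block
    (1056985679, 4),           # second column of that row
]

for value, width in samples:
    print(f"{value:>20}  0x{value:0{width * 2}x}  {as_ascii(value, width)!r}")
```

This would be consistent with the blocks behind these rows holding a filler pattern instead of row data, although the dump alone cannot show where such a pattern came from.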

It looks suspicious; however, there are about 837 more lines before the output stops.

Nico

On Fri, May 22, 2020 at 3:27 PM Adrian Klaver <adrian.klaver@xxxxxxxxxxx> wrote:

    On 5/22/20 5:37 AM, Nico De Ranter wrote:
     > Hi all,
     >
     > Postgres version: 9.5
     > OS: Ubuntu 18.04.4
     >
     > I have a 144GB Bacula database that crashes the postgres daemon when I
     > try to do a pg_dump.
     > At some point the server ran out of disk space for the database storage.
     > I expanded the LVM and rebooted the server. It seemed to work fine;
     > however, when I try to dump the bacula database the postgres daemon dies
     > after about 37GB.
     >
     > I tried copying the database to another machine and upgrading postgres
     > to 11 using pg_upgrade.  The upgrade seems to work but I still get
     > exactly the same problem when trying to dump the database.
     >
     > postgres@core4:~$ pg_dumpall --cluster 11/main --file=dump.sql
     > pg_dump: Dumping the contents of table "file" failed: PQgetCopyData() failed.
     > pg_dump: Error message from server: server closed the connection unexpectedly
     > This probably means the server terminated abnormally
     > before or while processing the request.
     > pg_dump: The command was: COPY public.file (fileid, fileindex, jobid, pathid, filenameid, deltaseq, markid, lstat, md5) TO stdout;
     > pg_dumpall: pg_dump failed on database "bacula", exiting

    What happens if you try to dump just this table?

    Something along lines of:

    pg_dump -t file -d some_db -U some_user

    Have you looked at the system logs to see if it is the OS killing the
    process?


     >
     > In the logs I see:
     >
     > 2020-05-22 14:23:30.649 CEST [12768] LOG:  server process (PID 534) was terminated by signal 11: Segmentation fault
     > 2020-05-22 14:23:30.649 CEST [12768] DETAIL:  Failed process was running: COPY public.file (fileid, fileindex, jobid, pathid, filenameid, deltaseq, markid, lstat, md5) TO stdout;
     > 2020-05-22 14:23:30.651 CEST [12768] LOG:  terminating any other active server processes
     > 2020-05-22 14:23:30.651 CEST [482] WARNING:  terminating connection because of crash of another server process
     > 2020-05-22 14:23:30.651 CEST [482] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
     > 2020-05-22 14:23:30.651 CEST [482] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
     > 2020-05-22 14:23:30.652 CEST [12768] LOG:  all server processes terminated; reinitializing
     > 2020-05-22 14:23:30.671 CEST [578] LOG:  database system was interrupted; last known up at 2020-05-22 14:15:19 CEST
     > 2020-05-22 14:23:30.809 CEST [578] LOG:  database system was not properly shut down; automatic recovery in progress
     > 2020-05-22 14:23:30.819 CEST [578] LOG:  redo starts at 197/D605EA18
     > 2020-05-22 14:23:30.819 CEST [578] LOG:  invalid record length at 197/D605EA50: wanted 24, got 0
     > 2020-05-22 14:23:30.819 CEST [578] LOG:  redo done at 197/D605EA18
     > 2020-05-22 14:23:30.876 CEST [12768] LOG:  database system is ready to accept connections
     > 2020-05-22 14:29:07.511 CEST [12768] LOG:  received fast shutdown request
     >
     >
     > Any ideas how to fix or debug this?
     >
     > Nico
     >
     > --
     >
     > Nico De Ranter
     > Operations Engineer
     > T. +32 16 38 72 10
     >
     > eSATURNUS
     > Philipssite 5, D, box 28
     > 3001 Leuven – Belgium
     >
     > T. +32 16 40 12 82
     > F. +32 16 40 84 77
     > www.esaturnus.com
     >
     > For Service & Support:
     > Support Line Belgium: +32 2 2009897
     > Support Line International: +44 12 56 68 38 78
     > Or via email: medical.services.eu@xxxxxxxx
     >


    --
    Adrian Klaver
    adrian.klaver@xxxxxxxxxxx



--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx




