On Monday, 23 January 2006 17:05, Tom Lane wrote:
> Janning Vygen <vygen@xxxxxx> writes:
> > pg_dump: ERROR:  invalid memory alloc request size 18446744073709551614
> > pg_dump: SQL command to dump the contents of table "spieletipps" failed:
> > PQendcopy() failed.
>
> This looks more like a corrupt-data problem than anything else.  Have
> you tried the usual memory and disk testing programs?

No, I didn't. What are the usual memory and disk testing programs? (A few weeks ago I wanted to start a troubleshooting guide for people like me, but I haven't started yet. This needs to be documented.) I am not a system administrator, and a hard disk is a black box to me.

By the way: the database is still running and serving requests.

> > recent thread on HACKERS but sorry guys: i dont know how to produce a
> > backtrace.
>
> Time to learn ;-)
>
> gdb /path/to/postgres_executable /path/to/core_file
> gdb> bt
> gdb> q

I shouldn't run gdb while my database is up and running, should I?

I tried to find and delete the corrupted row (as you described in http://archives.postgresql.org/pgsql-admin/2006-01/msg00117.php). I found it:

$ select sp_id from spieletipps limit 1 offset 387583;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q

And I can get the ctid:

$ select ctid from spieletipps limit 1 offset 387583;
   ctid
-----------
 (3397,49)
(1 row)

But when I try to delete it:

$ delete from spieletipps where ctid = '(3397,49)';
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

How can I get rid of it?
(I don't have OIDs in the table; I created it without OIDs.)

> The core file will be somewhere under $PGDATA, named either "core" or
> "core.nnnnn" depending on your kernel settings.  If you don't see one
> then it's probable that the postmaster was started under "ulimit -c 0".
> Put "ulimit -c unlimited" in your postgres startup script, restart,
> trigger the crash again.
>
> It's also a good idea to look in the postmaster log to see if any
> unusual messages appeared before the crash.

This is from the postmaster log:

LOG:  server process (PID 14756) was terminated by signal 11
LOG:  terminating any other active server processes
LOG:  all server processes terminated; reinitializing
FATAL:  the database system is starting up
LOG:  database system was interrupted at 2006-01-23 09:46:03 CET
LOG:  checkpoint record is at 1/D890C0E0
LOG:  redo record is at 1/D88F93E8; undo record is at 0/0; shutdown FALSE
LOG:  next transaction ID: 485068; next OID: 16882321
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 1/D88F93E8
LOG:  record with zero length at 1/D8953988
LOG:  redo done at 1/D8953920
LOG:  database system is ready
LOG:  server process (PID 15198) was terminated by signal 11
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
FATAL:  the database system is in recovery mode
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted at 2006-01-23 09:46:15 CET
LOG:  checkpoint record is at 1/D8953988
LOG:  redo record is at 1/D8953988; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 485130; next OID: 16882321
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 1/D89539D0
LOG:  record with zero length at 1/D8966BF8
LOG:  redo done at 1/D8966BC8
LOG:  database system is ready
LOG:  server process (PID 15400) was terminated by signal 11
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted at 2006-01-23 09:46:24 CET
LOG:  checkpoint record is at 1/D8966BF8
LOG:  redo record is at 1/D8966BF8; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 485183; next OID: 16882321
LOG:  database system was not properly shut down; automatic recovery in progress
FATAL:  the database system is starting up
LOG:  redo starts at 1/D8966C40
LOG:  record with zero length at 1/D8991CC8
LOG:  redo done at 1/D8991C98
LOG:  database system is ready

Any further help is much appreciated.

Kind regards,
Janning
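
A side note on the number in the pg_dump error quoted at the top: 18446744073709551614 is exactly 2^64 - 2, which is the bit pattern of -2 when a 64-bit value is read as unsigned. That would be consistent with a damaged length field in the row, although the arithmetic alone does not prove it. A minimal Python sketch of the conversion (the helper name as_signed_64 is made up for this illustration and is not part of PostgreSQL):

```python
# Size reported by pg_dump, copied from the error message in this thread.
reported = 18446744073709551614


def as_signed_64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as signed two's complement."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 bits")
    # Values with the top bit set represent negative numbers.
    return value - 2**64 if value >= 2**63 else value


if __name__ == "__main__":
    print(as_signed_64(reported))
```

Nobody allocates that much memory on purpose, so a tiny negative value showing up as an allocation size fits a corrupt-data explanation better than a genuine out-of-memory condition.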