Sorry, forgot the attachment.

On Mon, 2006-01-02 at 15:24 -0700, warren little wrote:
> The dump/restore failed even with zero_damaged_pages=true.
> The logfile (postgresql-2006-01-02_130023.log)
> did not have much in the way of useful info. I've attached the section
> of the logfile around the time of the crash. I cannot find any sign of
> a core file. Where might the core dump have landed?
>
> Regarding your comments about losing the evidence, the data I'm trying
> to load is in another database in the same cluster, which I have no
> intention of purging until I can get the table moved to the new
> database.
>
> thanks
>
> On Mon, 2006-01-02 at 16:34 -0500, Tom Lane wrote:
> > warren little <warren.little@xxxxxxxxxxxxxxxxxxx> writes:
> > > pg_dump: SQL command failed
> > > pg_dump: Error message from server: server closed the connection
> > > unexpectedly
> > > This probably means the server terminated abnormally
> > > before or while processing the request.
> > > pg_dump: The command was: FETCH 100 FROM _pg_dump_cursor
> >
> > Hmm. This could mean corrupted data files, but it's hard to be sure
> > without more info.
> >
> > > I had removed all the files in pg_log prior to getting this error and no
> > > new logfile was created. I'm guessing I screwed up the logger when
> > > removing all the files, but I assumed that when writing to the error
> > > logs the backend would create a file if one did not exist.
> >
> > The file *does* exist, there's just no directory link to it anymore :-(
> > You need to force a logfile rotation, which might be most easily done by
> > stopping and restarting the postmaster.
> >
> > What you need to do is see the postmaster log entry about the backend
> > crash. If it's dying on a signal (likely sig11 = SEGV) then inspecting
> > the core file might yield useful information.
> >
> > > I am currently attempting to run the dump/restore with
> > > zero_damaged_pages turned on to see if the results yield something
> > > more useful.
> >
> > That really ought to be the last resort, not the first one, because it
> > will destroy not only data but most of the evidence about what went
> > wrong...
> >
> > regards, tom lane
@ 2006-01-02 15:02:02 MST:LOG: autovacuum: processing database "tigris"
@ 2006-01-02 15:03:01 MST:LOG: server process (PID 28772) was terminated by signal 11
@ 2006-01-02 15:03:01 MST:LOG: terminating any other active server processes
[local]@[local] 2006-01-02 15:03:01 MST:WARNING: terminating connection because of crash of another server process
[local]@[local] 2006-01-02 15:03:01 MST:DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
[local]@[local] 2006-01-02 15:03:01 MST:HINT: In a moment you should be able to reconnect to the database and repeat your command.
192.168.19.129(50732)@192.168.19.129 2006-01-02 15:03:01 MST:WARNING: terminating connection because of crash of another server process
192.168.19.129(50732)@192.168.19.129 2006-01-02 15:03:01 MST:DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
192.168.19.129(50732)@192.168.19.129 2006-01-02 15:03:01 MST:HINT: In a moment you should be able to reconnect to the database and repeat your command.
192.168.19.129(50730)@192.168.19.129 2006-01-02 15:03:01 MST:WARNING: terminating connection because of crash of another server process
192.168.19.129(50730)@192.168.19.129 2006-01-02 15:03:01 MST:DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
192.168.19.129(50730)@192.168.19.129 2006-01-02 15:03:01 MST:HINT: In a moment you should be able to reconnect to the database and repeat your command.
192.168.19.129(50731)@192.168.19.129 2006-01-02 15:03:01 MST:WARNING: terminating connection because of crash of another server process
192.168.19.129(50731)@192.168.19.129 2006-01-02 15:03:01 MST:DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
192.168.19.129(50731)@192.168.19.129 2006-01-02 15:03:01 MST:HINT: In a moment you should be able to reconnect to the database and repeat your command.
@ 2006-01-02 15:03:01 MST:LOG: all server processes terminated; reinitializing
@ 2006-01-02 15:03:01 MST:LOG: database system was interrupted at 2006-01-02 15:02:47 MST
@ 2006-01-02 15:03:01 MST:LOG: checkpoint record is at 37/D60F93A8
@ 2006-01-02 15:03:01 MST:LOG: redo record is at 37/D6008018; undo record is at 0/0; shutdown FALSE
@ 2006-01-02 15:03:01 MST:LOG: next transaction ID: 32196280; next OID: 102041945
@ 2006-01-02 15:03:01 MST:LOG: next MultiXactId: 41; next MultiXactOffset: 93
@ 2006-01-02 15:03:01 MST:LOG: database system was not properly shut down; automatic recovery in progress
@ 2006-01-02 15:03:01 MST:LOG: redo starts at 37/D6008018
@ 2006-01-02 15:03:01 MST:LOG: record with zero length at 37/D60F93F8
@ 2006-01-02 15:03:01 MST:LOG: redo done at 37/D60F93A8
@ 2006-01-02 15:03:02 MST:LOG: database system is ready
@ 2006-01-02 15:03:02 MST:LOG: transaction ID wrap limit is 1087118600, limited by database "cert"
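
The entry that matters in the excerpt above is the postmaster's report that a backend was terminated by a signal. As a rough sketch only, the following Python snippet pulls such entries out of a log file's text. The function name `find_crashes` and the `sample` string are illustrative assumptions, not part of PostgreSQL, and the pattern is taken from the wording of the one crash line in this log, so other server versions or locales may phrase it differently.

```python
import re

# Pattern copied from the wording of the crash line in the log excerpt;
# assumed, not guaranteed, to match other PostgreSQL versions.
CRASH_RE = re.compile(
    r"server process \(PID (\d+)\) was terminated by signal (\d+)"
)


def find_crashes(log_text):
    """Return a list of (pid, signal_number) tuples, one per crash entry."""
    return [(int(pid), int(sig)) for pid, sig in CRASH_RE.findall(log_text)]


# One line taken from the log excerpt, used here as sample input.
sample = (
    "@ 2006-01-02 15:03:01 MST:LOG: server process (PID 28772) "
    "was terminated by signal 11"
)

for pid, sig in find_crashes(sample):
    print(f"backend PID {pid} terminated by signal {sig}")
```

To scan a whole logfile, the file's contents can be read into a string and passed to `find_crashes` in the same way.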