> "Roberts, Jon" <Jon.Roberts@xxxxxxxxxxx> writes: > > Version: "PostgreSQL 8.3.0, compiled by Visual C++ build 1400" > > Well, there are plenty of known bugs in 8.3.0 by now. You really > should update before complaining, not after. I'm not complaining. I just want to make sure that if I upgrade, it will fix the problem. An upgrade could possible introduce a new problem. I also wonder if this is isolated to Win32 because we are upgrading to Solaris very soon. > > > Problem: My database keeps on crashing every few days with this type of > > error message: > > > 2008-07-01 10:46:30 CDT LOG: all server processes terminated; > > reinitializing > > I think your real problem is with what happened *before* that. I found the first instance of "crash" and then got the rest of the log file. 2008-07-01 10:43:42 CDT LOG: server process (PID 3524) exited with exit code 128 2008-07-01 10:43:42 CDT LOG: terminating any other active server processes 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT CONTEXT: SQL statement "SELECT pg_sleep(cast(control.fn_get_variable('sleep_time') as int))" PL/pgSQL function "fn_update_status" line 136 at PERFORM SQL statement "SELECT gp_load.fn_update_status( $1 , $2 , 'Processing', '', 0)" PL/pgSQL function "fn_load" line 56 at PERFORM 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT CONTEXT: PL/pgSQL function "fn_get_job_details" line 114 at IF PL/pgSQL function "fn_load" line 465 at FOR over SELECT rows 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT CONTEXT: SQL statement "SELECT pg_sleep(cast(control.fn_get_variable('sleep_time') as int))" PL/pgSQL function "fn_update_status" line 136 at PERFORM SQL statement "SELECT gp_load.fn_update_status( $1 , $2 , 'Processing', '', 0)" PL/pgSQL function "fn_load" line 56 at PERFORM 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT CONTEXT: SQL statement "SELECT pg_sleep(cast(control.fn_get_variable('sleep_time') as int))" PL/pgSQL function "fn_update_status" line 136 at PERFORM SQL statement "SELECT gp_load.fn_update_status( $1 , $2 , 'Processing', '', 0)" PL/pgSQL function "fn_load" line 56 at PERFORM 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT CONTEXT: SQL statement "SELECT pg_sleep(cast(control.fn_get_variable('sleep_time') as int))" PL/pgSQL function "fn_update_status" line 136 at PERFORM SQL statement "SELECT gp_load.fn_update_status( $1 , $2 , 'Processing', '', 0)" PL/pgSQL function "fn_load" line 56 at PERFORM 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT CONTEXT: SQL statement "SELECT pg_sleep(cast(control.fn_get_variable('sleep_time') as int))" PL/pgSQL function "fn_update_status" line 136 at PERFORM SQL statement "SELECT gp_load.fn_update_status( $1 , $2 , 'Processing', '', 0)" PL/pgSQL function "fn_load" line 56 at PERFORM 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:42 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:42 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:42 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. INFO: 2008-07-01 10:44:42.373008-05:3000:Executing Query ..... create temporary table order_iud on commit drop as select s."jid",s."order_id",s."parent_order_id",s."claim_id",s."fulfillment_met hod",s."order_status_id",s."payment_status",s."created_by",s."created_dt m",s."modified_by",s."modified_dtm",s."deleted_flg",s."claimed_item_id", s."order_state_hashcode",s."autopaymentauth_flg",s."salvageexpectation_s tatus",s."snr_fee",s."salvageexpectation_reason",s."salvageexpectation_n ote",s."nocharge_override",s."change_datetime",s."change_type",s."edw_pr ocess_flag",s."edw_process_ts" from stage_intelliset.order s, (select max("order".jid) as jid, "order".order_id from stage_intelliset."order" group by "order".order_id) m where m.jid = s.jid and m.order_id::text = s.order_id::text distributed by (order_id) INFO: 2008-07-01 10:44:43.185777-05:3000:Execution Complete. INFO: 2008-07-01 10:44:46.988228-05:5000:Executing Update ..... update replica_intelliset.invoiceautopayment set "order_id" = y."order_id", "invoice_id" = y."invoice_id", "transaction_id" = y."transaction_id", "paymenttype_code" = y."paymenttype_code", "amount" = y."amount", "status_code" = y."status_code", "deleted_flg" = y."deleted_flg", "created_by" = y."created_by", "created_dtm" = y."created_dtm", "modified_by" = y."modified_by", "modified_dtm" = y."modified_dtm", edw_modified_dt = y.edw_process_ts, edw_modified_id = y.jid, edw_clock_ts = clock_timestamp() from (select a."invoiceautopayement_id", a."order_id", a."invoice_id", a."transaction_id", a."paymenttype_code", a."amount", a."status_code", a."deleted_flg", a."created_by", a."created_dtm", a."modified_by", a."modified_dtm", a.edw_process_ts, a.jid, a.change_type from invoiceautopayment_iud a inner join replica_intelliset.invoiceautopayment b on a.invoiceautopayement_id = b.invoiceautopayement_id) y where replica_intelliset.invoiceautopayment.invoiceautopayement_id = y.invoiceautopayement_id and y.change_type <> 3 INFO: 2008-07-01 10:44:48.785215-05:5000:Update Complete. INFO: 2008-07-01 10:44:48.786175-05:6000:Executing Insert ..... insert into replica_intelliset.invoiceautopayment ("invoiceautopayement_id", "order_id", "invoice_id", "transaction_id", "paymenttype_code", "amount", "status_code", "deleted_flg", "created_by", "created_dtm", "modified_by", "modified_dtm", edw_created_dt, edw_modified_dt, edw_created_id, edw_modified_id, edw_clock_ts) select a."invoiceautopayement_id", a."order_id", a."invoice_id", a."transaction_id", a."paymenttype_code", a."amount", a."status_code", a."deleted_flg", a."created_by", a."created_dtm", a."modified_by", a."modified_dtm", edw_process_ts as edw_created_dt, null as edw_modified_dt, a.jid as edw_created_id, null as edw_modified_id, clock_timestamp() as edw_clock_ts from invoiceautopayment_iud a left outer join replica_intelliset.invoiceautopayment b on a.invoiceautopayement_id = b.invoiceautopayement_id where (b.invoiceautopayement_id is null) and a.change_type <> 3 INFO: 2008-07-01 10:44:50.738058-05:6000:Insert Complete. INFO: 2008-07-01 10:44:50.750247-05:8000:Executing insert into archive table..... insert into stage_intelliset.invoiceautopayment_arch select * from stage_intelliset.invoiceautopayment INFO: 2008-07-01 10:44:50.839105-05:8000:Insert complete 2008-07-01 10:43:55 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:55 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:55 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:55 CDT CONTEXT: SQL statement "select * from dblink('gp', $1 ) as t1 (return_text varchar)" PL/pgSQL function "fn_remote_sql" line 63 at SQL statement SQL statement "SELECT gp_execute.fn_remote_sql( $1 , $2 )" PL/pgSQL function "fn_load" line 393 at PERFORM 2008-07-01 10:43:55 CDT LOG: could not send data to client: No connection could be made because the target machine actively refused it. 2008-07-01 10:43:55 CDT CONTEXT: SQL statement "select * from dblink('gp', $1 ) as t1 (return_text varchar)" PL/pgSQL function "fn_remote_sql" line 63 at SQL statement SQL statement "SELECT gp_execute.fn_remote_sql( $1 , $2 )" PL/pgSQL function "fn_load" line 393 at PERFORM 2008-07-01 10:43:55 CDT STATEMENT: select * from gp_load.fn_load(809) INFO: 2008-07-01 10:44:53.103065-05:5000:Executing Update ..... update replica_intelliset.order set "parent_order_id" = y."parent_order_id", "claim_id" = y."claim_id", "fulfillment_method" = y."fulfillment_method", "order_status_id" = y."order_status_id", "payment_status" = y."payment_status", "created_by" = y."created_by", "created_dtm" = y."created_dtm", "modified_by" = y."modified_by", "modified_dtm" = y."modified_dtm", "deleted_flg" = y."deleted_flg", "claimed_item_id" = y."claimed_item_id", "order_state_hashcode" = y."order_state_hashcode", "autopaymentauth_flg" = y."autopaymentauth_flg", "salvageexpectation_status" = y."salvageexpectation_status", "snr_fee" = y."snr_fee", "salvageexpectation_reason" = y."salvageexpectation_reason", "salvageexpectation_note" = y."salvageexpectation_note", "nocharge_override" = y."nocharge_override", edw_modified_dt = y.edw_process_ts, edw_modified_id = y.jid, edw_clock_ts = clock_timestamp() from (select a."order_id", a."parent_order_id", a."claim_id", a."fulfillment_method", a."order_status_id", a."payment_status", a."created_by", a."created_dtm", a."modified_by", a."modified_dtm", a."deleted_flg", a."claimed_item_id", a."order_state_hashcode", a."autopaymentauth_flg", a."salvageexpectation_status", a."snr_fee", a."salvageexpectation_reason", a."salvageexpectation_note", a."nocharge_override", a.edw_process_ts, a.jid, a.change_type from order_iud a inner join replica_intelliset.order b on a.order_id = b.order_id) y where replica_intelliset.order.order_id = y.order_id and y.change_type <> 3 INFO: 2008-07-01 10:44:54.536366-05:5000:Update Complete. INFO: 2008-07-01 10:44:54.537312-05:6000:Executing Insert ..... insert into replica_intelliset.order ("order_id", "parent_order_id", "claim_id", "fulfillment_method", "order_status_id", "payment_status", "created_by", "created_dtm", "modified_by", "modified_dtm", "deleted_flg", "claimed_item_id", "order_state_hashcode", "autopaymentauth_flg", "salvageexpectation_status", "snr_fee", "salvageexpectation_reason", "salvageexpectation_note", "nocharge_override", edw_created_dt, edw_modified_dt, edw_created_id, edw_modified_id, edw_clock_ts) select a."order_id", a."parent_order_id", a."claim_id", a."fulfillment_method", a."order_status_id", a."payment_status", a."created_by", a."created_dtm", a."modified_by", a."modified_dtm", a."deleted_flg", a."claimed_item_id", a."order_state_hashcode", a."autopaymentauth_flg", a."salvageexpectation_status", a."snr_fee", a."salvageexpectation_reason", a."salvageexpectation_note", a."nocharge_override", edw_process_ts as edw_created_dt, null as edw_modified_dt, a.jid as edw_created_id, null as edw_modified_id, clock_timestamp() as edw_clock_ts from order_iud a left outer join replica_intelliset.order b on a.order_id = b.order_id where (b.order_id is null) and a.change_type <> 3 INFO: 2008-07-01 10:44:54.577749-05:6000:Insert Complete. INFO: 2008-07-01 10:44:54.583708-05:8000:Executing insert into archive table..... insert into stage_intelliset.order_arch select * from stage_intelliset.order INFO: 2008-07-01 10:44:54.715591-05:8000:Insert complete 2008-07-01 10:43:58 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:43:58 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:43:58 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:43:58 CDT CONTEXT: SQL statement "select * from dblink('gp', $1 ) as t1 (return_text varchar)" PL/pgSQL function "fn_remote_sql" line 63 at SQL statement SQL statement "SELECT gp_execute.fn_remote_sql( $1 , $2 )" PL/pgSQL function "fn_load" line 393 at PERFORM 2008-07-01 10:43:58 CDT LOG: could not send data to client: No connection could be made because the target machine actively refused it. 2008-07-01 10:43:58 CDT CONTEXT: SQL statement "select * from dblink('gp', $1 ) as t1 (return_text varchar)" PL/pgSQL function "fn_remote_sql" line 63 at SQL statement SQL statement "SELECT gp_execute.fn_remote_sql( $1 , $2 )" PL/pgSQL function "fn_load" line 393 at PERFORM 2008-07-01 10:43:58 CDT STATEMENT: select * from gp_load.fn_load(813) 2008-07-01 10:46:30 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:46:30 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:46:30 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:46:30 CDT CONTEXT: PL/pgSQL function "fn_load" line 271 at IF 2008-07-01 10:46:30 CDT LOG: could not send data to client: No connection could be made because the target machine actively refused it. 2008-07-01 10:46:30 CDT CONTEXT: PL/pgSQL function "fn_load" line 271 at IF 2008-07-01 10:46:30 CDT STATEMENT: select * from gp_load.fn_load(757) 2008-07-01 10:46:30 CDT WARNING: terminating connection because of crash of another server process 2008-07-01 10:46:30 CDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 2008-07-01 10:46:30 CDT HINT: In a moment you should be able to reconnect to the database and repeat your command. 2008-07-01 10:46:30 CDT CONTEXT: SQL statement "select remote_execute.fn_get_max( $1 , $2 , $3 , $4 , $5 , $6 , $7 , $8 , $9 , $10 , $11 )" PL/pgSQL function "fn_load" line 265 at SQL statement 2008-07-01 10:46:30 CDT LOG: could not send data to client: No connection could be made because the target machine actively refused it. 2008-07-01 10:46:30 CDT CONTEXT: SQL statement "select remote_execute.fn_get_max( $1 , $2 , $3 , $4 , $5 , $6 , $7 , $8 , $9 , $10 , $11 )" PL/pgSQL function "fn_load" line 265 at SQL statement 2008-07-01 10:46:30 CDT STATEMENT: select * from gp_load.fn_load(965) 2008-07-01 10:46:30 CDT LOG: all server processes terminated; reinitializing 2008-07-01 10:46:31 CDT FATAL: pre-existing shared memory block is still in use 2008-07-01 10:46:31 CDT HINT: Check if there are any old server processes still running, and terminate them. > But: > > > 2008-07-01 10:46:31 CDT FATAL: pre-existing shared memory block is > > still in use > > 2008-07-01 10:46:31 CDT HINT: Check if there are any old server > > processes still running, and terminate them. > > Hmm ... the code in win32_shmem.c that generates this message seems > mighty bogus to me --- it's just hoping that one-second delay is > enough. Another problem is that postmaster children that do > PGSharedMemoryDetach will still have valid inherited handles for > the shmem segment --- does that factor into the behavior? It looks > to me like the CloseHandle ought to be in PGSharedMemoryDetach. > > regards, tom lane This wouldn't be a problem in Unix? Jon