This is most likely the operating system's OOM killer terminating the PG process; check the syslog.

On Fri 30 Aug 2013 05:10:42 PM CST, 高健 wrote:
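A quick way to confirm that, assuming a typical Linux setup (the log file name and the exact kernel wording vary by distribution and kernel version), is to look for OOM-killer messages:

[root@server ~]# dmesg | grep -i -E 'out of memory|oom|killed process'
[root@server ~]# grep -i -E 'oom|out of memory' /var/log/messages    # /var/log/syslog on Debian-family systems

If an OOM kill shows up there, the usual advice in the PostgreSQL documentation for a dedicated database server is to set vm.overcommit_memory = 2 (with a suitable vm.overcommit_ratio), so the kernel refuses allocations up front instead of killing processes such as the background writer later.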
Hello:

I have done the following experiment to test how PG behaves when dealing with data that is larger than the total memory of the whole OS.

The result is that PG says:
----------------------------
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
The connection to the server was lost. Attempting reset: Failed.
!>
-----------------------------

In the log, I can see the following:

LOG: background writer process (PID 3221) was terminated by signal 9: Killed

But why did the bgwriter need to be killed? And does PG really use as much resource as it can to finish a single SQL DML statement, until it has used almost all of it and then crashes?

I used the default checkpoint_segments (3) and shared_buffers (32MB) settings, and my machine has only 1024MB of memory.

The procedure was:

1) Create a table whose tuples are a bit more than 1KB each:

postgres=# create table test01(id integer, val char(1024));

2) Insert 2457600 records into the table, which comes to more than 2400MB in total:

postgres=# insert into test01 values(generate_series(1,2457600),repeat(chr(int4(random()*26)+65),1024));

It took a few minutes. While the statement was running, I could see the server process consuming about 80% of the total memory of the OS:
----------------------------------------------------------------------------------------------------------------------------------
[root@server ~]# ps aux | grep post
root      3180  0.0  0.0 105296   712 pts/1   S   16:31  0:00 su - postgres
postgres  3181  0.0  0.0  70304   676 pts/1   S+  16:31  0:00 -bash
postgres  3219  0.0  0.2 113644  2864 pts/1   S   16:32  0:00 /usr/local/pgsql/bin/postgres -D /gao/data
postgres  3221  0.4  3.0 113724 35252 ?       Ss  16:32  0:01 postgres: writer process
postgres  3222  0.2  0.1 113644  1616 ?       Ds  16:32  0:00 postgres: wal writer process
postgres  3223  0.0  0.0 114380  1148 ?       Ss  16:32  0:00 postgres: autovacuum launcher process
postgres  3224  0.0  0.0  73332   472 ?       Ss  16:32  0:00 postgres: stats collector process
root      3252  0.0  0.0 105296   712 pts/2   S   16:32  0:00 su - postgres
postgres  3253  0.0  0.0  70304   676 pts/2   S   16:32  0:00 -bash
postgres  3285  0.0  0.0  83488   740 pts/2   S+  16:32  0:00 ./psql
postgres  3286 14.8 80.2 2598332 924308 ?     Ds  16:32  0:35 postgres: postgres postgres [local] INSERT
root      3333  0.0  0.0  65424   812 pts/3   S+  16:36  0:00 grep post
--------------------------------------------------------

After a while, I found the following information in PG's log:
--------------------------------------------------------
LOG: autovacuum launcher started
LOG: database system is ready to accept connections
LOG: checkpoints are occurring too frequently (2 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (1 second apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (2 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (1 second apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (2 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (1 second apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (2 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (4 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (10 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (9 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (7 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (9 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (7 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (5 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (5 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (10 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (11 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (15 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (23 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (7 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (8 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (13 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (7 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (8 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (8 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (8 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (4 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (5 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
LOG: checkpoints are occurring too frequently (9 seconds apart)
HINT: Consider increasing the configuration parameter "checkpoint_segments".
--------------------------------------------------------
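The repeated HINT above is a direct consequence of running the bulk insert with the default checkpoint_segments = 3. A minimal tuning sketch for a 9.x-era server follows; the value 32 is purely illustrative, and the pg_ctl path and data directory are assumed from the ps output above:

postgres=# SHOW checkpoint_segments;    -- confirm the current value (3, the default)

# in postgresql.conf: allow more WAL between checkpoints, so they are driven by
# checkpoint_timeout rather than by WAL volume during the load
checkpoint_segments = 32

$ /usr/local/pgsql/bin/pg_ctl -D /gao/data reload    # checkpoint_segments takes effect on reload, no restart needed

Raising it only quiets these warnings and smooths the checkpoint I/O; it does not change the memory consumption that triggered the kill. The log excerpt then continues with the crash itself and the automatic recovery: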
--------------------------------------------------------
LOG: background writer process (PID 3221) was terminated by signal 9: Killed
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
LOG: all server processes terminated; reinitializing
FATAL: the database system is in recovery mode
LOG: database system was interrupted; last known up at 2013-08-30 16:36:42 CST
LOG: database system was not properly shut down; automatic recovery in progress
LOG: consistent recovery state reached at 0/B7657BD0
LOG: redo starts at 0/B60FE2B8
LOG: unexpected pageaddr 0/B044C000 in log file 0, segment 184, offset 4505600
LOG: redo done at 0/B844B940
LOG: autovacuum launcher started
LOG: database system is ready to accept connections
-------------------------------------------------------------------------------------

Best Regards