Have you turned off autovacuum on the first db node before starting online recovery? -- Tatsuo Ishii SRA OSS, Inc. Japan > Hello, > > today I recognized an error while playing with pgpool-II 2.1 and > postgresql 8.3.5 (on 64bit linux). > I really don't know if the error is caused by pgpool or if it's > postgresql-related...?!? > > There are up to 200 active connections in parallel and thereby sometimes > high load (with high write:read ratio, peak level ~60% write). > Can this be a problem in such a usage scenario, are there some > limitations in pgpool-II <-> number of connections / load ??? > > any hints are much appreciated, since we don't know what happened... > > thanks in advance...::GERD::... > > > > LOG DETAILS: > > a) postgresql log of second db node where the error occured > b) pgpool log > > > a) > ================================================================== > 2009-03-10 11:16:33 CET [11008]: [1-1] PANIC: right sibling's left-link > doesn't match: block 695 links to 785 instead of expected 694 in index > "PartyAddress_index" > 2009-03-10 11:16:33 CET [11008]: [2-1] STATEMENT: INSERT INTO > "PartyAddress" ("Name1", "Name2", "Street1", "Street2", " > City", "Postcode", "Region", "Country", "Timezone", "Qualifier", > "Compressed", "PositionId") VALUES (E'Testing', E'', E' > Public street', E'', E'somewhere', E'12345', E'-', E'DE', E'UTC+1', > E'address.loading', FALSE, NULL); select lastval() > 2009-03-10 11:16:33 CET [29026]: [3-1] LOG: server process (PID 11008) > was terminated by signal 6: Aborted > 2009-03-10 11:16:33 CET [29026]: [4-1] LOG: terminating any other > active server processes > 2009-03-10 11:16:33 CET [10978]: [5-1] WARNING: terminating connection > because of crash of another server process > 2009-03-10 11:16:33 CET [10978]: [6-1] DETAIL: The postmaster has > commanded this server process to roll back the curren > t transaction and exit, because another server process exited abnormally > and possibly corrupted shared memory. > 2009-03-10 11:16:33 CET [10978]: [7-1] HINT: In a moment you should be > able to reconnect to the database and repeat you > r command. > 2009-03-10 11:16:33 CET [10947]: [1-1] WARNING: terminating connection > because of crash of another server process > ..... > ..... > 2009-03-10 11:16:33 CET [29026]: [6-1] LOG: all server processes > terminated; reinitializing > 2009-03-10 11:16:33 CET [11015]: [1-1] LOG: database system was > interrupted; last known up at 2009-03-10 11:09:10 CET > 2009-03-10 11:16:33 CET [11015]: [2-1] LOG: starting archive recovery > 2009-03-10 11:16:33 CET [11015]: [3-1] LOG: restore_command = 'cp > /exchange/wal_from_iweblin/%f %p' > cp: cannot stat `/exchange/wal_from_iweblin/00000003.history': No such > file or directory > cp: cannot stat `/exchange/wal_from_iweblin/0000000300000001000000AA': > No such file or directory > 2009-03-10 11:16:33 CET [11015]: [4-1] LOG: automatic recovery in progress > 2009-03-10 11:16:33 CET [11015]: [5-1] LOG: redo starts at 1/AABEBFA8 > 2009-03-10 11:16:33 CET [11015]: [6-1] LOG: unexpected pageaddr > 0/36BEE000 in log file 1, segment 170, offset 12509184 > 2009-03-10 11:16:33 CET [11015]: [7-1] LOG: redo done at 1/AABED938 > 2009-03-10 11:16:33 CET [11015]: [8-1] LOG: last completed transaction > was at log time 2009-03-10 11:15:17.533913+01 > cp: cannot stat `/exchange/wal_from_iweblin/0000000300000001000000AA': > No such file or directory > cp: cannot stat `/exchange/wal_from_iweblin/00000004.history': No such > file or directory > 2009-03-10 11:16:33 CET [11015]: [9-1] LOG: selected new timeline ID: 4 > cp: cannot stat `/exchange/wal_from_iweblin/00000003.history': No such > file or directory > 2009-03-10 11:16:33 CET [11015]: [10-1] LOG: archive recovery complete > 2009-03-10 11:16:33 CET [11015]: [11-1] LOG: checkpoint starting: > shutdown immediate > 2009-03-10 11:16:34 CET [11015]: [12-1] LOG: checkpoint complete: wrote > 4 buffers (0.0%); 0 transaction log file(s) added, 0 removed, 0 > recycled; write=0.005 s, sync=0.086 s, total=0.242 s > 2009-03-10 11:16:34 CET [11023]: [1-1] LOG: autovacuum launcher started > 2009-03-10 11:16:34 CET [29026]: [7-1] LOG: database system is ready to > accept connections > ======================================= > > > > b) > ========================================== > 2009-03-10 11:16:23 DEBUG: pid 19059: read kind from frontend X(58) > 2009-03-10 11:16:31 DEBUG: pid 27107: read kind from frontend Q(51) > 2009-03-10 11:16:31 LOG: pid 27107: statement: SELECT d.oid, datname > as name, pg_get_userbyid(datdba) as owner, ds.des > cription, pg_encoding_to_char(encoding) as dbencoding, datistemplate, > datlastsysoid, datconfig, s.spcname as tblspace, d > atconnlimit AS conlimit FROM pg_database d LEFT OUTER JOIN > pg_description ds ON d.oid = ds.objoid LEFT OUTER JOIN pg_tab > lespace s ON d.dattablespace = s.oid WHERE datname = 'test' > > 2009-03-10 11:16:31 DEBUG: pid 27107: waiting for backend 0 completing > the query > 2009-03-10 11:16:31 DEBUG: pid 27107: read_kind_from_backend: read kind > from 0 th backend T NUM_BACKENDS: 1 > 2009-03-10 11:16:31 DEBUG: pid 27107: pool_process_query: kind from > backend: T > 2009-03-10 11:16:31 DEBUG: pid 27107: read_kind_from_backend: read kind > from 0 th backend D NUM_BACKENDS: 1 > 2009-03-10 11:16:31 DEBUG: pid 27107: pool_process_query: kind from > backend: D > 2009-03-10 11:16:31 DEBUG: pid 27107: read_kind_from_backend: read kind > from 0 th backend C NUM_BACKENDS: 1 > 2009-03-10 11:16:31 DEBUG: pid 27107: pool_process_query: kind from > backend: C > 2009-03-10 11:16:31 DEBUG: pid 27107: read_kind_from_backend: read kind > from 0 th backend Z NUM_BACKENDS: 1 > 2009-03-10 11:16:31 DEBUG: pid 27107: pool_process_query: kind from > backend: Z > 2009-03-10 11:16:31 DEBUG: pid 27107: pool_read_message_length: slot: 0 > length: 5 > 2009-03-10 11:16:31 DEBUG: pid 27107: ReadyForQuery: message length: 5 > 2009-03-10 11:16:31 DEBUG: pid 27107: ReadyForQuery: transaction state: I > 2009-03-10 11:16:31 DEBUG: pid 27107: end_load_balance: end load balance > mode > 2009-03-10 11:16:31 DEBUG: pid 27107: read kind from frontend X(58) > 2009-03-10 11:16:36 DEBUG: pid 27107: I am 27107 accept fd 5 > 2009-03-10 11:16:36 LOG: pid 27107: connection received: > host=172.28.1.126 port=1946 > 2009-03-10 11:16:36 DEBUG: pid 27107: Protocol Major: 1234 Minor: 5679 > database: user: > 2009-03-10 11:16:36 DEBUG: pid 27107: SSLRequest: sent N; retry startup > 2009-03-10 11:16:36 DEBUG: pid 27107: Protocol Major: 3 Minor: 0 > database: test user: test > 2009-03-10 11:16:36 DEBUG: pid 27107: new_connection: connecting 0 backend > 2009-03-10 11:16:36 DEBUG: pid 27107: new_connection: connecting 1 backend > 2009-03-10 11:16:36 DEBUG: pid 27107: new_connection: skipping slot 1 > because backend_status = 3 > _______________________________________________ > Pgpool-general mailing list > Pgpool-general@xxxxxxxxxxxxx > http://pgfoundry.org/mailman/listinfo/pgpool-general -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general