Can you run db wide vacuums? That's what I'd try first. On Fri, May 18, 2012 at 10:41 AM, Little, Douglas <DOUGLAS.LITTLE@xxxxxxxxxx> wrote: > Hello, > > Apologies in advance for asking here and not on the greenplum forum, but > I’m asking here because I think the PG community understands pg better than > Greenplum. > > > > We run Greenplum 4.1 which is based on PG 8.2.15 > > We recently had a problem where the transaction ID overflowed an internal > buffer, causing queries to hang and randomly fail. > > Greenplum provided a fix where they increased the size of the buffer. > > Greenplum has disabled the Autovacuum features. I suspect because they have > no method of coordinating the autovacuums across the nodes. > > > > I’ve reviewed the PG doc on transaction id and > > I’m trying to understand if there’s something we are doing wrong. > > Practices concerning me > > 1. We never vacuum the entire db – we’re doing the system catalog > tables 3x/day and user tables at least 1x/w- although we don’t have a method > of identifying tables that have not been vacuumed so it’s possible we’re > missing some table. > > 2. In the pg_clog directories we have 25 files dating back to aug > 2011. When the instances start, the logs are filled with twophase location > message. Are these expected? > > 3. Is the overflow really being caused by transaction wrap around? > > > > Would someone comment on these concerns. > > Thanks > > > > > > Guk setting > > vacuum_freeze_min_age 100,000,000 Minimum age at which > VACUUM should freeze a table row. > > > > > > Log failure message > > > > 2012-05-10 16:26:24.425750 > CDT,"cognos_cube","p1gp1",p25497,th2121975200,"172.28.8.250","18959",2012-05-10 > 16:25:03 > CDT,100000245,con2170,,seg24,,,x100000245,sx1,"LOG","00000","FinishPreparedTransaction: > found TwoPhaseState entry for 99999948",,,,,,,0,,"twophase.c",1368, > > 2012-05-10 16:26:24.739536 > CDT,"infa_mktldr","p1gp1",p27945,th2121975200,"172.28.8.250","27576",2012-05-10 > 16:26:15 > CDT,100000246,con2240,,seg24,,,x100000246,sx1,"LOG","00000","FinishPreparedTransaction: > found TwoPhaseState entry for 100000230",,,,,,,0,,"twophase.c",1368, > > 2012-05-10 16:26:33.859255 > CDT,"infa_read","p1gp1",p27981,th2121975200,"172.28.8.250","27866",2012-05-10 > 16:26:18 > CDT,100000256,con2244,,seg24,,,x100000256,sx1,"LOG","00000","FinishPreparedTransaction: > found TwoPhaseState entry for 100000236",,,,,,,0,,"twophase.c",1368, > > 2012-05-10 16:26:38.564872 > CDT,"ods","p1gp1",p27416,th2121975200,"172.28.8.250","23337",2012-05-10 > 16:26:06 > CDT,100000257,con2212,,seg24,,dx4860103,x100000257,sx1,"LOG","00000"," > > Unexpected internal error: Segment process 27416 received signal SIGSEGV > > > > ",,,,,,,0,,,, > > 2012-05-10 16:26:41.359615 > CDT,"ods","p1gp1",p27416,th2121975200,"172.28.8.250","23337",2012-05-10 > 16:26:06 > CDT,100000257,con2212,,seg24,,dx4860103,x100000257,sx1,"PANIC","XX000","Unexpected > internal error: Segment process received signal SIGSEGV > (postgres.c:3384)",,,,,,,0,,"postgres.c",3384,"Stack trace: > > 1 0xa39555 postgres errstart (elog.c:454) > > 2 0x8e26e5 postgres <symbol not found> (postgres.c:3380) > > 3 0x38c70302d0 libc.so.6 <symbol not found> (??:0) > > 4 0x8b1578 postgres ProcArrayAdd (procarray.c:149) > > 5 0x5258a5 postgres EndPrepare (twophase.c:1240) > > 6 0x4f3462 postgres <symbol not found> (xact.c:3094) > > 7 0x4f3a75 postgres CommitTransactionCommand (xact.c:3523) > > 8 0xb944a5 postgres performDtxProtocolCommand (cdbtm.c:3891) > > 9 0x8e6552 postgres PostgresMain (postgres.c:1414) > > 10 0x8516b1 postgres <symbol not found> (postmaster.c:6443) > > 11 0x8589a5 postgres PostmasterMain (postmaster.c:2272) > > 12 0x76713a postgres main (main.c:212) > > 13 0x38c701d994 libc.so.6 __libc_start_main (??:0) > > 14 0x475089 postgres <symbol not found> (??:0) > > " > > 2012-05-10 16:26:46.580196 > CDT,,,p25872,th2121975200,,,,0,,,seg-1,,,,,"LOG","00000","server process > (PID 27416) was terminated by signal 6: > Aborted",,,,,,,0,,"postmaster.c",5275, > > > > Log messages regarding twophase > > 2012-05-18 00:05:49.849381 > CDT,"infa_write","p1gp1",p15767,th-1760766560,"172.28.8.250","33445",2012-05-18 > 00:05:29 > CDT,104342361,con385826,,seg0,,,x104342361,sx1,"LOG","00000","FinishPreparedTransaction: > found TwoPhaseState entry for 104342242",,,,,,,0,,"twophase.c",1368, > > 2012-05-18 00:05:50.018299 > CDT,"infa_write","p1gp1",p15887,th-1760766560,"172.28.8.250","34675",2012-05-18 > 00:05:32 > CDT,104342362,con385833,,seg0,,,x104342362,sx1,"LOG","00000","FinishPreparedTransaction: > found TwoPhaseState entry for 104342257",,,,,,,0,,"twophase.c",1368, > > 2012-05-18 00:05:50.029302 > CDT,"infa_write","p1gp1",p15839,th-1760766560,"172.28.8.250","34335",2012-05-18 > 00:05:30 > CDT,104342363,con385830,,seg0,,,x104342363,sx1,"LOG","00000","FinishPreparedTransaction: > found TwoPhaseState entry for 104342250",,,,,,,0,,"twophase.c",1368, > > 2012-05-18 00:05:52.274591 > CDT,"infa_write","p1gp1",p15959,th-1760766560,"172.28.8.250","35303",2012-05-18 > 00:05:36 > CDT,104342369,con385840,,seg0,,,x104342369,sx1,"LOG","00000","FinishPreparedTransaction: > found TwoPhaseState entry for 104342277",,,,,,,0,,"twophase.c",1368, > > 2012-05-18 00:05:52.277127 > CDT,"infa_write","p1gp1",p15478,th-1760766560,"172.28.8.250","31184",2012-05-18 > 00:05:24 > CDT,104342370,con385813,,seg0,,,x104342370,sx1,"LOG","00000","FinishPreparedTransaction: > found TwoPhaseState entry for 104342346",,,,,,,0,,"twophase.c",1368, > > 2012-05-18 00:05:52.280285 > CDT,"infa_write","p1gp1",p15535,th-1760766560,"172.28.8.250","31304",2012-05-18 > 00:05:25 > CDT,104342371,con385816,,seg0,,,x104342371,sx1,"LOG","00000","FinishPreparedTransaction: > found TwoPhaseState entry for 104342347",,,,,,,0,,"twophase.c",1368, > > > > Thanks > > Doug Little > > > > -- To understand recursion, one must first understand recursion. -- Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-admin