2010/4/22 Achilleas Mantzios <achill@xxxxxxxxxxxxxxxxxxxxx>: > Στις Thursday 22 April 2010 16:53:05 ο/η Cédric Villemain έγραψε: >> 2010/4/22 Achilleas Mantzios <achill@xxxxxxxxxxxxxxxxxxxxx>: >> > Hello, >> > i have this serious problem in one of our remote vessels. (comm is done by minicom to the remote satelite modem) >> > I think that this server was under some sort of constant resets or hardware failures. >> > Initially,i had this problem: >> > ERROR: invalid page header in block 672720 of relation "pg_toast_125716009" >> > >> > This toast table corresponds to a table named "mail_message", >> > Table "public.mail_message" >> > Column | Type | Modifiers >> > -----------+-------------------+----------------------------------------------------------- >> > msgno | mail_msgno_domain | not null default nextval('mail_msgno_sequence'::regclass) >> > msgsource | bytea | >> > Indexes: >> > "mail_message_key" PRIMARY KEY, btree (msgno) >> > >> > (obviously the TOAST table serves the msgsource varlena) the contents of which is not of vital importance. >> > I tried, REINDEXING, with no success, and after that, i tried resetting the said block on disk as per this >> > suggestion by Tom here: http://old.nabble.com/invalid-page-header-td11981154.html >> > >> > i found the oid of the table: >> > SELECT tableoid,1 from pg_toast.pg_toast_125716009 order by chunk_id LIMIT 1 >> > tableoid | ?column? >> > -----------+---------- >> > 125716013 | 1 >> > >> > (and just to verify) >> > SELECT relname from pg_class where oid=125716013; >> > relname >> > -------------------- >> > pg_toast_125716009 >> > >> > Then i did: (as i said i do not need the contents of msgsource - yet the contents of msgno are vital) >> > dd if=/dev/zero of=/usr/local/var/lib/pgsql/data/base/125714957/125716013 seek=672720 bs=8192 count=1 >> >> segment have 1.1GB size maximum. You have to catch in what segment the >> faulty block is, and reajust the block value from the error report to >> the real one in the good segment. >> > > Thanx, > Taking a look at /usr/local/src/postgresql-8.3.3/src/backend/storage/smgr/md.c > i see the the error comes from function mdnblocks > > if (nblocks > ((BlockNumber) RELSEG_SIZE)) > elog(FATAL, "segment too big"); > > That means, that some segment file is bigger than RELSEG_SIZE > At least in my system: > #define BLCKSZ 8192 > #define RELSEG_SIZE (0x40000000 / BLCKSZ) > So, any segment file cannot be bigger than RELSEG_SIZE blocks (or 2^30 bytes = 1GB) > > Currently i dont have any access to the machine but tomorrow i will check the file sizes. > > Can anyone shed some light as to some method of identifying all the segment files of a table? > The first one has the same name as the tableoid. > How about the subsequent segments? Your execution of dd make your first segment bigger than expected. Other segment have the same name with a .1 .2 etc suffix. You have to shrink your first segment to the correct size. check what happens, you should have now the original error. And, I have never used it, but I think it is the purpose of zero_damaged_pages to parameter to allow postgresql itself to zero the bad black. (reading src/backend/storage/buffer/bufmgr.c confirm that. *BUT* take care that it will zero *every* bad page, perhaps not only the one trapping an error. In those situation, it is good to make a snapshot of the pgdata directory, in case your fingers surf too fast on the keyboard.... If you don't want to activate zero_damage_page, then go and calculate which block in which segment you have to zeroing. side note, it may be usefull to have the relevant information in the error message... > >> > >> > However, after that, unfortunately i get constant postgresql server restarts with: >> > FATAL: segment too big >> > server closed the connection unexpectedly >> > This probably means the server terminated abnormally >> > before or while processing the request. >> > The connection to the server was lost. Attempting reset: Succeeded. >> > >> > Is there anything i can do to savage the situation? >> > >> > (one of) the hard part here is that i dont have neither physical nor network access to the server >> > (only ultra expensive unreliable satellite comms) >> > >> > Thanks for any hints... >> > >> > -- >> > Achilleas Mantzios >> > >> > -- >> > Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx) >> > To make changes to your subscription: >> > http://www.postgresql.org/mailpref/pgsql-admin >> > >> >> >> > > > > -- > Achilleas Mantzios > -- Cédric Villemain -- Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-admin