On Mon, Apr 25, 2011 at 9:19 PM, Phoenix Kiula <phoenix.kiula@xxxxxxxxx> wrote: > On Fri, Apr 22, 2011 at 8:35 PM, <tv@xxxxxxxx> wrote: >>> On Fri, Apr 22, 2011 at 8:20 PM, <tv@xxxxxxxx> wrote: >>>>> On Fri, Apr 22, 2011 at 7:07 PM, <tv@xxxxxxxx> wrote: >>>>> In the pg_dumpall backup process, I get this error. Does this help? >>>>> >>>> >>>> Well, not really - it's just another incarnation of the problem we've >>>> already seen. PostgreSQL reads the data, and at some point it finds out >>>> it >>>> needs to allocate 4294967293B of memory. Which is strange, because it's >>>> actually a negative number (-3 AFAIK). >>>> >>>> It's probably caused by data corruption (incorrect length for a field). >>>> >>>> There are ways to find out more about the cause, e.g. here: >>>> >>>> http://archives.postgresql.org/pgsql-hackers/2005-10/msg01198.php >>>> >>>> but you need to have a pg compiled with debug support. I guess the >>>> packaged version does not support that, but maybe you can get the >>>> sources >>>> and compile them on your own. >>>> >>>> If it really is a data corruption, you might try to locate the corrupted >>>> blocks like this: >>>> >>>> -- get number of blocks >>>> SELECT relpages FROM pg_class WHERE relname = 'table_name'; >>>> >>>> -- get items for each block (read the problematic column) >>>> FOR block IN 1..relpages LOOP >>>> SELECT AVG(length(colname)) FROM table_name WHERE ctid >= >>>> '(block,0)'::ctid AND ctid < '(block+1,0)'::ctid; >>> >>> >>> Thanks for this. Very useful. What is this -- a function? How should I >>> execute this query? >> >> It's a pseudocode - you need to implement that in whatever language you >> like. You could do that in PL/pgSQL but don't forget it's probably going >> to crash when you hit the problematic block so I'd probably implement that >> in outside the DB (with a logic to continue the loop once the connection >> dies). >> >> And 'ctid' is a pseudocolumn that means '(block#, row#)' i.e. it's >> something like a physical location of the row. >> >> regards >> Tomas > > > > A question. > > Is data dumped from "COPY TO" command any use? > > It has taken me days, but I have managed to COPY my large table in chunks. > > If I subsequently COPY FROM these files, would this be a workable solution? > > My fear based on my ignorance is that maybe the data corruption, if > any exists, will also get COPY-ied and therefore transferred into the > fresh database. > > Is this fear justified, or is COPY a viable alternative? > > Thanks! > Sorry, spoke too soon. I can COPY individual chunks to files. Did that by year, and at least the dumping worked. Now I need to pull the data in at the destination server. If I COPY each individual file back into the table, it works. Slowly, but seems to work. I tried to combine all the files into one go, then truncate the table, and pull it all in in one go (130 million rows or so) but this time it gave the same error. However, it pointed out a specific row where the problem was: COPY links, line 15272357: "16426447 9s2q7 9s2q7 N http://www.amazon.com/gp/search?camp=1789&creative=9325&ie=UTF8&i..." server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. Is this any use at all? Would appreciate any pointers! -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general