Well, we already have the schema in a canned file. So, we wouldn't
need to do a pg_dump for the schema, and I can get into the database
using psql obviously. That's how I produced the setting from
pg_settings. But, I really don't want to have to go through 150K+
records to find the multiple broken ones, if I don't have to do so.
I'm just hoping for a faster way.
On 7/17/2014 3:30 PM, Scott Whitney
wrote:
Well, at this point, let me let the big brains answer your
main question on the thread and tell you what _I_ did when I
got into a similar situation.
a) pg_dump --schema-only <database name> >
db.structure
b) From here it was a lot of select * from table until I
found the broken one(s) and I worked around the actual bad
data by offset/limit on the selects on the busted tables.
It took me several hours, but it did work (for me).
That's assuming you can even psql to the db in the first
place when the postmaster is running.
We're only working with a single database on this system, and
yes, I attempted a pg_dump earlier today on just one single
table - it failed (I know pg_dump and pg_basebackup use
different mechanisms.) Mind you it's a large table with
154,000 rows in it. The overall database is somewhere around
43GB.
On 7/17/2014 3:16 PM, Scott
Whitney wrote:
a) How many databases do you have?
b) Have you tried to pg_dump the database(s) to see
if that succeeds?
If you _can_ pg_dump (which you might or might not
be able to do), you could re-init the cluster and
restore.
Hi
all,
You may have seen my post from yesterday about our
production database getting corrupted. Well, this
morning we brought the system down to single user and
ran an fsck which did
report some drive errors. We repeated until no
additional errors were reported. Then, we brought the
system back to multi-user status and ran a successful
pg_basebackup on the
broken database. Since then we restarted the database
and a ps -ef result looks like:
/usr/pgsql-9.2/bin/postmaster -D /opt/datacenter -o -c
zero_damaged_pages=true -i -N 384 -p 5431
After the Db started up, we ran a VACUUM FULL ANALYZE
which ran for about 3 hours, but the database is still
showing the same type of errors in its log: invalid page
header in
block 29718... etc. What disturbed me a little, is that
I don't think the zero_damaged_pages got applied.
Checking the pg_settings table, we got:
select name, setting, boot_val, reset_val from
pg_settings where name = 'zero_damaged_pages';
name | setting |
boot_val | reset_val
---------------------------------------------------------------------------------
zero_damaged_pages | on | off
| on
Now, my colleague ran this after he tried running some
operations again after I told him how to set
zero_damaged_pages again. He swears that that it was on
when the first VACUUM
FULL ANALYZE was run, but I'm not as sure. Plus, I don't
understand why the boot_val shows as off. In any event,
as we're still getting log errors like before, I don't
really know
what to try next other than rerunning the VACUUM FULL
again. Help?
--
Jay
--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
|