Search Postgresql Archives

Re: Somewhat automated method of cleaning table of corrupt records for pg_dump

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 22.10.2012 22:34, schrieb Martijn van Oosterhout:
Something that has worked for me in the past is:

$ SELECT ctid FROM table WHERE length(field) < 0;

As the structure of the tables (about four were affected) isn't something that I wanted to actually look at, I set off writing a small script in Python which fetches ctids from all tables in the DB, then tries to fetch the corresponding row, and in case that fails, removes it. That worked beautifully for me (it removed somewhere around eight/nine rows from the DB), and I can now dump the whole database correctly. Running the script took around 48 hours for the complete database.

It's not pretty, but works, and for anybody else finding themselves in this situation, here's the code:

--------------------------------
# -*- coding: utf-8 -*-

# Imports
# -------

import psycopg2


# Open database
# -------------

conn = psycopg2.connect("dbname=<name of DB> user=postgres")


# Select tables
# -------------

cur = conn.cursor()
cur.execute("SELECT relname FROM pg_class WHERE relkind = 'r' AND relnamespace = 2200")
TABS = cur.fetchall()
cur.close()


# Walk tables
# -----------

for tab in TABS:
    # Fetching from tab.
    print "Fetching from:", tab[0]

    # Fetch cursor.
    cur = conn.cursor()
    cur.execute("SELECT ctid FROM %s" % (tab[0],))
    stmt = "SELECT * FROM %s WHERE ctid = %%s" % (tab[0],)
    dstmt = "DELETE FROM %s WHERE ctid = %%s" % (tab[0],)
    todelete = []
    while True:
        # Fetch an element.
        ctid = cur.fetchone()
        if ctid is None:
            print "Done with:", tab[0]
            break

        # Fetch complete element.
        curr = conn.cursor()
        try:
            curr.execute(stmt, ctid)
            curr.fetchone()
            curr.close()
        except:
            # Failed to fetch, clean any pending transaction.
            print "Failed to fetch:", ctid[0]
            curr.close()
            cur.close()
            conn.rollback()

            # Create new cursor on connection.
            curr = conn.cursor()
            curr.execute(dstmt, ctid)
            conn.commit()
            curr.close()

            # Reset cursor.
            cur = conn.cursor()
            cur.execute("SELECT ctid FROM %s" % (tab[0],))
--------------------------------

Thanks again for all the hints!

--
--- Heiko.


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux