Re: Large Database \d: ERROR: cache lookup failed for relation ...


 



I'm working with these guys to resolve the immediate issue, but I suspect there's a race condition somewhere in the code.

What's happened is that OIDs have been changed in the system. There's not a lot of table DDL that happens, but there is a substantial amount of view DDL that can take place. In a nutshell, tables will sometimes have fields added to them, and when that happens a whole set of views needs to be re-created to take the new fields into account.

The files for the corrupted tables do exist; this seems to be mostly a catalog corruption issue. I'm seeing what appear to be inconsistencies between relcache and the catalog tables, as well as inconsistencies between the catalog tables themselves:

emma2=# select * from userdata_8464_campaigns;
ERROR:  could not open relation with OID 138807643
emma2=# \d userdata_8464_campaigns
          Table "public.userdata_8464_campaigns"
         Column          |            Type             |                            Modifiers
-------------------------+-----------------------------+------------------------------------------------------------------
campaign_id              | bigint                      | not null default nextval(('emma_campaigns_seq'::text)::regclass)
account_id               | bigint                      | not null
cep_object_id            | bigint                      | not null default nextval(('cep_object_seq'::text)::regclass)
campaign_name            | character varying(255)      | not null
campaign_subject         | character varying(255)      | not null
layout_page_id           | bigint                      | not null
layout_content_id        | bigint                      | not null
campaign_create_date     | timestamp without time zone | not null default now()
campaign_last_mod_date   | timestamp without time zone | not null default now()
campaign_status          | character varying(50)       | not null
campaign_parent_id       | bigint                      |
published_campaign_id    | bigint                      |
campaign_plaintext       | text                        |
campaign_plaintext_ds    | timestamp without time zone |
delivery_old_score       | double precision            |
campaign_person_defaults | text                        |
Inherits: emma_campaigns

select oid from pg_class where relname='userdata_8464_campaigns';
  oid
--------
533438
(1 row)

And that file actually does exist on disk...

select * from pg_index where indexrelid=138807643;
 indexrelid | indrelid | indnatts | indisunique | indisprimary | indisclustered | indisvalid | indkey | indclass | indexprs | indpred
------------+----------+----------+-------------+--------------+----------------+------------+--------+----------+----------+---------
  138807643 |   533438 |        1 | t           | t            | f              | t          | 1      | 1980     |          |
(1 row)

select * from pg_class where oid=138807643;
 relname | relnamespace | reltype | relowner | relam | relfilenode | reltablespace | relpages | reltuples | reltoastrelid | reltoastidxid | relhasindex | relisshared | relkind | relnatts | relchecks | reltriggers | relukeys | relfkeys | relrefs | relhasoids | relhaspkey | relhasrules | relhassubclass | relfrozenxid | relacl | reloptions
---------+--------------+---------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+---------+----------+-----------+-------------+----------+----------+---------+------------+------------+-------------+----------------+--------------+--------+------------
(0 rows)
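
The two queries above show a pg_index row whose indexrelid has no matching row in pg_class. One way to check how widespread that is might be an anti-join between the two catalogs. The following is only a sketch: it uses Python's sqlite3 with toy stand-ins for pg_class and pg_index so that it runs without a server. The names ORPHAN_INDEX_SQL, find_orphan_indexes and build_toy_catalog are hypothetical, and every row other than the two OIDs quoted above is made up. The SELECT is plain SQL and is meant to be pasted into psql against the real catalogs, but it has not been run on this database.

```python
import sqlite3

# Anti-join: rows in pg_index whose index has no corresponding pg_class row.
ORPHAN_INDEX_SQL = """
SELECT i.indexrelid, i.indrelid
FROM pg_index AS i
LEFT JOIN pg_class AS c ON c.oid = i.indexrelid
WHERE c.oid IS NULL
ORDER BY i.indexrelid
"""


def find_orphan_indexes(conn):
    """Return (indexrelid, indrelid) pairs that have no pg_class entry."""
    return conn.execute(ORPHAN_INDEX_SQL).fetchall()


def build_toy_catalog():
    """Create simplified, in-memory stand-ins for the two catalogs (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pg_class (oid INTEGER, relname TEXT)")
    conn.execute("CREATE TABLE pg_index (indexrelid INTEGER, indrelid INTEGER)")
    conn.executemany(
        "INSERT INTO pg_class VALUES (?, ?)",
        [
            (533438, "userdata_8464_campaigns"),  # table OID from the thread
            (600001, "healthy_table"),            # made-up healthy table
            (600002, "healthy_table_pkey"),       # made-up healthy index
        ],
    )
    conn.executemany(
        "INSERT INTO pg_index VALUES (?, ?)",
        [
            (138807643, 533438),  # index OID from the thread, missing in pg_class
            (600002, 600001),     # made-up index that does have a pg_class row
        ],
    )
    return conn


if __name__ == "__main__":
    for indexrelid, indrelid in find_orphan_indexes(build_toy_catalog()):
        print(indexrelid, indrelid)
```

An anti-join is used rather than NOT IN so that a NULL in either column cannot silently empty the result.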

On Jun 5, 2007, at 11:27 AM, Erik Jones wrote:

I originally sent this message from my gmail account yesterday because we were having issues with our work mail servers, but since it hasn't made it to the lists yet, I'm resending from my registered address. My apologies if you receive this twice.

"Thomas F. O'Connell" <tf ( at ) o ( dot ) ptimized ( dot ) com> writes:
> I'm dealing with a database where there are ~150,000 rows in
> information_schema.tables. I just tried to do a \d, and it came back
> with this:

> ERROR:  cache lookup failed for relation [oid]

> Is this indicative of corruption, or is it possibly a resource issue?

Greetings,

This message is a follow-up to Thomas's message quoted above (we're working together on the same database). He received one response to it, from Tom Lane, which can be summarized as saying that this could happen if tables were being created or dropped while \d was running in psql. Unfortunately, that wasn't the case. We have now determined that there is some corruption in our database, and we are hoping some of you back-end gurus might have some suggestions.

We verified the corruption by reindexing all of our tables, and we got the same errors when running a dump this past weekend. So far we have a list of five tables for which reindex fails with the error "ERROR: could not open relation with OID xxxx" (substitute the five different numbers for xxxx), and one that fails reindexing with "ERROR: xxxxx is an index", where xxxxx is an index on a completely different table. After dropping all of the indexes on these tables (a couple didn't have any to begin with), we still cannot run reindex on them. We can't drop the tables either (we get the same errors). We can, however, run alter table statements on them. So we have scheduled downtime for an evening later this week, when we plan to bring the database down for a REINDEX SYSTEM. Before that, we are going to run a dump excluding those tables, restore it on a separate machine, and see whether these errors crop up anywhere there. Is there anything else anyone can think of that we can do to narrow down where the actual corruption is, or how to fix it?
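
The dump that skips the affected tables could be put together roughly as follows. This is a sketch under assumptions: it relies on pg_dump's --exclude-table option (so a pg_dump version that has it), the helper name build_pg_dump_command is hypothetical, and the database and table names are simply the ones that appear in the psql session elsewhere in this thread. The other affected tables are not named in this message, so they are not filled in.

```python
import shlex


def build_pg_dump_command(dbname, excluded_tables):
    """Build a pg_dump argument list that skips each table in excluded_tables."""
    args = ["pg_dump"]
    for table in excluded_tables:
        # One --exclude-table switch per table to leave out of the dump.
        args.append("--exclude-table=" + table)
    args.append(dbname)
    return args


if __name__ == "__main__":
    # Example names taken from the thread; extend the list with the other tables.
    cmd = build_pg_dump_command("emma2", ["userdata_8464_campaigns"])
    # Print a shell-safe command line for review before running it.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the command as a list means it can be handed to subprocess.run directly, without going through a shell, once the table list has been reviewed.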

Erik Jones

Software Developer | Emma®
erik@xxxxxxxxxx
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com





--
Jim Nasby                                            jim@xxxxxxxxx
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)



