Search Postgresql Archives

Re: can't start postgresql

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tuesday 14 November 2006 17:34, Alvaro Herrera wrote:
> Matthias.Pitzl@xxxxxx wrote:
> > He's just trying to do what you told him.
> > But the use of initdb won't correct the problem Igor got. There's
> > something
> >
> > wrong with his database so that he can't start the postmaster process:
> > > PANIC:  failed to re-find parent key in "23724"
> >
> > Unfortunately i don't have any idea how to solve this issue :(
>
> I'd think starting a standalone backend and issuing a "reindex database"
> should be enough to get him started.  Now, the problem is figuring _how_
> the index got in that state; or even _what_ index is the problematic
> one.  (I think it would be possible to find out by setting the
> "log_error_verbosity" parameter to "verbose").

The problem is, it won't start even in the standalone mode. It prints the same 
error message and dies. I was trying to start it connected to my main db, 
template0, template1 and postgres, no change. The index corruption (assuming 
this IS an index corruption) must've happened with system table(s). Is there 
any way to diagose that ? Maybe reindex a specific table without starting 
postmaster, or remove certain index files physically (e.g. for pg_class, 
etc) ?

> If it's a hardware problem, one would think it deserves some diagnosis.
> It could be the btree bug Tom fixed last week, but I'm not sure if this
> is really a consequence of it.

Can't rule out an HW problem. I'd think this IS an HW problem, but we've 
started using an updated version of our software on this server a week ago, 
and it creates/drops alot of tables, temporary and not (kind of in-house 
table partitioning), so this could be an (exceptionally rare) pgsql bug with 
pg_class / pg_index / etc handling. A side question: is it safe to 
create/drop tables from pl/pgsql functions, called directly or via a 
trigger ?

Our previous version of the software was using a light version of what we're 
trying to do now, and there was / maybe still a problem:

We're processing data in chunks, and each chunk is completely separate from 
others, so using a special table per chunk seems to be a natural thing to do. 
We keep these in a separate tablespace "w". There's one non-unique 
(integer,varchar) index per each such table; both fields are non-null. Both 
table and index are created via a pl/pgsql function. An index is created 
after the data import, during the same transaction:

a) begin
b) select create_w_table(jobid,false) -- false means don't create index
c) insert into w.tableX select *....
d) select create_w_table(jobid,true) -- true means create index
e) commit

Both table and index are dropped at the same time, using another pl/pgsql 
function.

create_w_table is rarely called from yet another pl/pgsql function.

Everything's working fine, except that sometimes an index can, um, "go off the 
rails":

* not listed by "psql"'s \d w.tableX
* queries are not using it
* it's still listed in the pg_class (seen by "select * from pg_class where 
relname='tblX_idx'")
* a corresponding "pg_index" row references a non-existing pg_class entry via 
pg_index.indrelid

I'm seeing this several times a week, for over 5 months now. Here's my fix 
procedure:

* fetch a list of probematic indexes:

select pc.relname,pc.oid,pc_idx.oid from pg_class pc left join pg_class pc_idx 
on (pc_idx.relname=pc.relname || '_idx') where pc.relname ~ '^table\\d+$' and 
(pc_idx.oid is NULL OR not exists (select 1 from pg_index where 
pg_index.indexrelid=pc_idx.oid and indrelid=pc.oid))

* fix each with:

update pg_index set indrelid=${right one from pg_class} where 
indexrelid=${index oid from pg_class};
update pg_class set relhasindex='t' where oid=${table oid}; -- AFAICR this is 
redundant, as relhasindex is still true

now I can drop this index (usually it's out of sync), vacuum the table and 
recreate index:

drop index w.tableX_idx;
vacuum full verbose w.tableX;
create index tableX_idx on w.tableX(...);
analyze w.tableX;


I've seen this on pgsql versions from v8.1.2 till v8.1.4, can't confirm with 
v8.1.5 yet. Autovacuum's on; daily "vacuum verbose analyze" and "reindex" are 
ran over pg_class, pg_index, a bunch of our own small tables, and recently 
over pg_depend, pg_type and pg_statistic.

-- 
Best Regards,
Igor Shevchenko


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux