On Fri, Oct 15, 2010 at 10:22 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
Alvaro Herrera <alvherre@xxxxxxxxxxxxxxxxx> writes:The latter.
> If we analyze the parent, do we also update the children stats, or is it
> just that we keep two stats for the parent, one with children and one
> without, both being updated when the parent is analyzed?
The trick here is that we need to fire an analyze on the parent even
though only its children may have had any updates.
> If the latter's the case, maybe we should modify ANALYZE a bit more, so
> that we can analyze the whole hierarchy in one go, and store the lot of
> stats with a single pass (each child alone, the parent alone, the parent
> plus children). However it's not real clear how would this work with
> multiple inheritance levels.
An issue with automatically analyzing the entire hierarchy is 'abstract' table definitions. I've got a set of tables for storing the same data at different granularities of aggregation. Within each granularity, I've got partitions, but because the set of columns is identical for each granularity, I've got an abstract table definition that is inherited by everything. I don't need or want statistics kept on that table because I never query across the abstract table, only the parent table of each aggregation granularity
create table abstract_fact_table (
time timestamp,
measure1 bigint,
measure2 bigint,
measure3 bigint,
fk1 bigint,
fk2 bigint
);
create table minute_scale_fact_table (
} inherits abstract_fact_table;
// Then there are several partitions for minute scale data
create table hour_scale_fact_table (
) inherits abstract_fact_table;
// then several partitions for hour scale data
etc. I do run queries on the minute_scale_fact_table and hour_scale_fact_table but never do so on abstract_fact_table. I could certainly modify my schema such that the abstract table goes away entirely easily enough, but I find this easier for new developers to come in and comprehend, since the similarity between the table definitions is explicit.
I'm glad this topic came up, as I was unaware that I need to run analyze on the parent partitions separately - and no data is every inserted directly into the top level of each granularity hierarchy, so it will never fire by itself.
If I am using ORM and I've got functionality in a common baseclass in the source code, I'll often implement its mapping in the database via a parent table that the table for any subclass mapping can inherit from. Again, I have no interest in maintaining statistics on the parent table, since I never query against it directly.