Re: Explains of queries to partitioned tables

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jul 26, 2010 at 4:47 AM, Vlad Arkhipov <arhipov@xxxxxxxxxxxx> wrote:
> There is a partitioned table with 2 partitions:
>
> drop table if exists p cascade;
>
> create table p (
>  id bigint not null primary key,
>  ts timestamp);
>
> create table p_actual ( check (ts is null) ) inherits (p);
> create table p_historical ( check (ts is not null) ) inherits (p);
>
> -- I skipped the triggers and rules creation
>
> insert into p (id, ts) values (1, '2000-01-01');
> insert into p (id, ts) values (2, null);
> insert into p (id, ts) values (3, '2001-01-01');
> insert into p (id, ts) values (4, '2005-01-01');
>
> analyze p;
> analyze p_actual;
> analyze p_historical;
>
> Here is the explain output for the query 'select * from p where ts is null'
>
> Result  (cost=0.00..188.10 rows=10 width=16) (actual time=0.028..0.038
> rows=1 loops=1)
>  ->  Append  (cost=0.00..188.10 rows=10 width=16) (actual time=0.023..0.029
> rows=1 loops=1)
>        ->  Seq Scan on p  (cost=0.00..187.00 rows=9 width=16) (actual
> time=0.002..0.002 rows=0 loops=1)
>              Filter: (ts IS NULL)
>        ->  Seq Scan on p_actual p  (cost=0.00..1.10 rows=1 width=16) (actual
> time=0.014..0.016 rows=1 loops=1)
>              Filter: (ts IS NULL)
> Total runtime: 0.080 ms
>
> You can notice that the optimizer expects 10 rows in the table p and as a
> result of this assumption the whole query is estimated as 10 rows. Whether
> it will cause a performance impact further? pg_stats does not contain any
> statistics on the table 'p'. Is this a cause of such behaviour?
> The estimation is worse for some other queries, for example 'select * from p
> where ts is not null'
>
> Result  (cost=0.00..188.30 rows=1764 width=16) (actual time=0.021..0.049
> rows=3 loops=1)
>  ->  Append  (cost=0.00..188.30 rows=1764 width=16) (actual
> time=0.016..0.032 rows=3 loops=1)
>        ->  Seq Scan on p  (cost=0.00..187.00 rows=1761 width=16) (actual
> time=0.003..0.003 rows=0 loops=1)
>              Filter: (ts IS NOT NULL)
>        ->  Seq Scan on p_historical p  (cost=0.00..1.30 rows=3 width=16)
> (actual time=0.008..0.015 rows=3 loops=1)
>              Filter: (ts IS NOT NULL)
> Total runtime: 0.095 ms

It would be easier to comment on this if you mentioned things like
which version of PG you're using, and what you have
constraint_exclusion set to, but as a general comment analyze doesn't
store statistics for any tables that are empty, because it assumes
that at some point you're going to put data in them.  So in this case
p_historical is probably using fake stats.  But it's not clear that it
really matters: you haven't got any relevant indices, so a sequential
scan is the only possible plan; and even if you did have some, there's
only 4 rows, so a sequential scan is probably the only plan that makes
sense anyway.  And your query ran in a tenth of a millisecond, which
is pretty zippy.  So I'm not really sure what the problem is.  If this
isn't the real data, post an example with the real data and ask for
help about that.

http://wiki.postgresql.org/wiki/SlowQueryQuestions

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux