Re: Strange explain on partitioned tables

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Greg,

First : thank you for you help.



On 22/07/2010 15:32, Greg Smith wrote:
Philippe Rimbault wrote:
I have one thousand inherited tables like this one (with a different check constraint on each) :

The PostgreSQL partitioning system is aimed to support perhaps a hundred inherited tables. You can expect to get poor performance on queries if you create 1000 of them. That's not the cause of your current problem, just pointing out there's a larger design problem here you'll probably have to fix one day.

Right now, there is only 6 inherited tables, but for performance issue, where are testing solutions on more partionned systeme (all work fine except for query with order by).


EXPLAIN ANALYZE
    select
        documents.id,
        documents.num,
        sources.name,
        l.name
    from
        documents,
        locations l,
        sources
    where
        documents.id_source = 113 and
        documents.id_location=l.id and
        documents.id_source=sources.id
    order by
        documents.id desc
    limit 5;

Please don't put your EXPLAIN plans surrounded in " marks; it makes it harder to copy out of your message to analyze them with tools. I put this bad one into http://explain.depesz.com/s/XD and it notes that the "public.documents.id_location = l.id" search is underestimating the number of rows by a factor of 8.7. You might get a better plan if you can get better table statistics on that column. Did you run ANALYZE since the partitioning was done? If not, that could be making this worse. You might increase the amount of table statistics on this specific column too, not sure what would help without knowing exactly what's in there.

Another thing you can try is suggest the optimizer not use a hash join here and see if it does the right thing instead; be a useful bit of feedback to see what that plan turns out to be. Just put "set enable_hashjoin=off;" before the rest of the query, it will only impact that session.


Sorry for the output of the EXPLAIN ...

VACUUM ANALYZE have been done just before test of query.

I think that the optimizer overestimates "public.documents.id_location = l.id" because it plan on the primary table and not the child ... I've change statistics to 1000 for documents.id_location and result is the same.

I've tested "set enable_hashjoin=off;" and the result is worst (sorry i'm still using 9.0b3) :

Limit (cost=197755.49..197755.50 rows=5 width=23) (actual time=4187.148..4187.150 rows=5 loops=1) -> Sort (cost=197755.49..200390.32 rows=1053932 width=23) (actual time=4187.146..4187.147 rows=5 loops=1)
        Sort Key: public.documents.id
        Sort Method:  top-N heapsort  Memory: 17kB
-> Nested Loop (cost=151258.55..180250.06 rows=1053932 width=23) (actual time=1862.214..3769.611 rows=1053929 loops=1) -> Index Scan using pk_sources on sources (cost=0.00..8.27 rows=1 width=8) (actual time=0.007..0.013 rows=1 loops=1)
                    Index Cond: (id = 113)
-> Merge Join (cost=151258.55..169702.47 rows=1053932 width=21) (actual time=1862.204..3360.555 rows=1053929 loops=1)
                    Merge Cond: (l.id = public.documents.id_location)
-> Sort (cost=1.75..1.81 rows=23 width=10) (actual time=0.028..0.036 rows=21 loops=1)
                          Sort Key: l.id
                          Sort Method:  quicksort  Memory: 17kB
-> Seq Scan on locations l (cost=0.00..1.23 rows=23 width=10) (actual time=0.002..0.009 rows=23 loops=1) -> Materialize (cost=151256.80..156526.46 rows=1053932 width=13) (actual time=1862.162..2841.302 rows=1053929 loops=1) -> Sort (cost=151256.80..153891.63 rows=1053932 width=13) (actual time=1862.154..2290.881 rows=1053929 loops=1)
                                Sort Key: public.documents.id_location
                                Sort Method:  external merge  Disk: 24496kB
-> Append (cost=0.00..27810.36 rows=1053932 width=13) (actual time=0.003..838.644 rows=1053929 loops=1) -> Seq Scan on documents (cost=0.00..18.25 rows=3 width=39) (actual time=0.000..0.000 rows=0 loops=1)
                                            Filter: (id_source = 113)
-> Seq Scan on documents_mond documents (cost=0.00..27792.11 rows=1053929 width=13) (actual time=0.002..502.345 rows=1053929 loops=1)
                                            Filter: (id_source = 113)
Total runtime: 4198.703 ms



--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux