Search Postgresql Archives

Re: Query planner riddle (array-related?)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Markus <m@xxxxxxx> writes:
> I'm running Postgresql 9.6.7 on Debian stretch -- and I'm trying to
> understand a query plan, with any hint where to gain further insight
> welcome.

Well, you say

>   select count(*) from gaia.dr2light where parallax>50;
> gives 5400 rows in no time.

but the planner thinks there are 12991627 such rows:

>          ->  Bitmap Heap Scan on dr2light  (cost=243173.69..25288015.74 rows=12991627 width=132)
>                Recheck Cond: (parallax > '50'::double precision)
>                ->  Bitmap Index Scan on dr2light_parallax  (cost=0.00..239925.78 rows=12991627 width=0)
>                      Index Cond: (parallax > '50'::double precision)

So my first instinct would be to try to get that estimate more in
line with reality.  Maybe you need to increase the statistics target
for that column.

Also, this sort of thing is usually much easier to diagnose from
EXPLAIN ANALYZE output.  All we can see from these queries is that
the planner picked what it thought was the lowest-cost plan.  Without
actual rowcounts it's very hard to guess why the estimates were wrong.
You happened to provide one actual-rowcount number that maybe was
enough to diagnose the issue; but if the above doesn't do the trick,
we're going to need to see EXPLAIN ANALYZE to guess what else is up.

			regards, tom lane




>    ->  Hash  (cost=118285.38..118285.38 rows=551038 width=1523)
>          ->  Seq Scan on dr2epochflux  (cost=0.00..118285.38 rows=551038 width=1523)


> And here's the bad plan (for query 1):

> --------------------------------------------------------------------------------------
>  Nested Loop  (cost=0.58..4801856.96 rows=4229 width=1647)
>    ->  Seq Scan on dr2epochflux  (cost=0.00..118285.38 rows=551038 width=1523)
>    ->  Index Scan using dr2light_pkey on dr2light  (cost=0.58..8.49 rows=1 width=132)
>          Index Cond: (source_id = dr2epochflux.source_id)
>          Filter: (parallax > '50'::double precision)

> If I enable_seqscan=0, it comes up with this for query 1:

> ---------------------------------------------------------------------------------------------------------
>  Nested Loop  (cost=1.00..4810726.18 rows=4229 width=1647)
>    ->  Index Scan using dr2epochflux_pkey on dr2epochflux  (cost=0.42..127154.60 rows=551038 width=1523)
>    ->  Index Scan using dr2light_pkey on dr2light  (cost=0.58..8.49 rows=1 width=132)
>          Index Cond: (source_id = dr2epochflux.source_id)
>          Filter: (parallax > '50'::double precision)

> -- which in reality appears to be a good deal faster than the "bad"
> plan, though still much, much slower than the "good plan".

> Both tables are ANALYZE-d, and they should be reasonably VACUUMED.

> Is there anything I can do to make it easier for the planner to see the
> light?

>            -- Markus






[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux