Search Postgresql Archives

Re: query speed question

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10 Sep 2009, at 19:09, Christopher Condit wrote:

Here are both queries (with explain analyze):

EXPLAIN ANALYZE SELECT B.* FROM B,
(SELECT lat, lon FROM A WHERE value > 0 AND value < 2 AND depth = 0) AS foo WHERE getwoalatitude(B.latitude::numeric) = foo.lat AND getwoalongitude(B.longitude::numeric) = foo.lon

after vacuum analyze:
"Merge Join (cost=17873237.91..26594735.94 rows=1984509 width=420) (actual time=752983.201..941125.197 rows=226941 loops=1)" " Merge Cond: ((a.lat = (getwoalatitude((b.latitude)::numeric))) AND (a.lon = (getwoalongitude((b.longitude)::numeric))))" " -> Sort (cost=2019.51..2037.61 rows=7239 width=16) (actual time=30.704..32.171 rows=2111 loops=1)"
"        Sort Key: a.lat, a.lon"
"        Sort Method:  quicksort  Memory: 212kB"
" -> Index Scan using depthidx on a (cost=0.00..1555.43 rows=7239 width=16) (actual time=0.533..24.631 rows=2111 loops=1)"
"              Index Cond: (depth = 0)"
" Filter: ((value > 0::numeric) AND (value < 2::numeric))"

Here you can see that the estimated row counts are way off compared to the actual number of rows. That means the statistics for that table aren't reliable; either you need to ANALYZE them or you need to increase the statistics size of the columns involved.

This part of the query is the quicker part, but it may be possible that the planner wouldn't have picked a merge-join with the remainder if the estimates were closer to reality.

You can paste those queries in http://explain-analyze.info/, although you'll need to strip the quotes from the plan (not sure where those get added, psql doesn't add them).

" -> Materialize (cost=17871190.21..18083338.96 rows=16971900 width=420) (actual time=751324.751..919278.574 rows=16963350 loops=1)" " -> Sort (cost=17871190.21..17913619.96 rows=16971900 width=420) (actual time=751324.744..820522.604 rows=16963350 loops=1)" " Sort Key: (getwoalatitude((b.latitude)::numeric)), (getwoalongitude((b.longitude)::numeric))"
"              Sort Method:  external merge  Disk: 4599344kB"
" -> Seq Scan on b (cost=0.00..750696.00 rows=16971900 width=420) (actual time=1.781..229158.949 rows=16971901 loops=1)"
"Total runtime: 942295.914 ms"

Here's where most time is spent. Firstly, an external merge to disk is going to be slow. I think the setting controlling the amount of memory available for sorting is work_mem, you can try increasing that in the client (set work_mem TO <some value>) before executing your query and see what value is sufficient.

The other issue here is the sequential scan on table b. I think that's caused by the planner needing to merge to disk though.

EXPLAIN ANALYZE SELECT b.* FROM b JOIN a ON (getwoalatitude (b.latitude::numeric) = a.lat AND getwoalongitude (b.longitude::numeric) = a.lon) WHERE a.value > 0 AND a.value < 2 AND a.depth = 0

"Merge Join (cost=17873237.91..26594735.94 rows=1984509 width=420) (actual time=755255.801..906407.961 rows=226941 loops=1)" " Merge Cond: ((a.lat = (getwoalatitude((b.latitude)::numeric))) AND (a.lon = (getwoalongitude((b.longitude)::numeric))))" " -> Sort (cost=2019.51..2037.61 rows=7239 width=16) (actual time=27.658..29.108 rows=2111 loops=1)"
"        Sort Key: a.lat, a.lon"
"        Sort Method:  quicksort  Memory: 212kB"
" -> Index Scan using depthidx on a (cost=0.00..1555.43 rows=7239 width=16) (actual time=0.467..21.646 rows=2111 loops=1)"
"              Index Cond: (depth = 0)"
" Filter: ((value > 0::numeric) AND (value < 2::numeric))" " -> Materialize (cost=17871190.21..18083338.96 rows=16971900 width=420) (actual time=753605.414..884549.890 rows=16963350 loops=1)" " -> Sort (cost=17871190.21..17913619.96 rows=16971900 width=420) (actual time=753605.407..822844.299 rows=16963350 loops=1)" " Sort Key: (getwoalatitude((b.latitude)::numeric)), (getwoalongitude((b.longitude)::numeric))"
"              Sort Method:  external merge  Disk: 4599344kB"
" -> Seq Scan on b (cost=0.00..750696.00 rows=16971900 width=420) (actual time=0.095..229888.646 rows=16971901 loops=1)"
"Total runtime: 911284.022 ms"

This plan is actually very similar to the original query's plan, it has the same problems.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4aa939f812071577543529!



--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux