Search Postgresql Archives

Re: Need help optimizing this query

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/18/07, Michael Glaesemann <grzm@xxxxxxxxxxxxxxx> wrote:

On Jul 18, 2007, at 16:48 , Pat Maddox wrote:

> For some reason the functions you wrote are giving me trouble (there's
> a BIGINT involved, I tried changing the functions around but kept
> having issues).

You might try these, if you're interested.

CREATE OR REPLACE FUNCTION ternary(BOOLEAN, BIGINT, BIGINT)
RETURNS BIGINT
LANGUAGE SQL AS $_$SELECT CASE WHEN $1 THEN $2 ELSE $3 END$_$;

CREATE OR REPLACE FUNCTION value_when(BOOLEAN, BIGINT)
RETURNS BIGINT
LANGUAGE SQL AS $_$SELECT ternary($1,$2,0)$_$;

CREATE OR REPLACE FUNCTION one_when(BOOLEAN)
RETURNS BIGINT
LANGUAGE SQL as $_$SELECT value_when($1,1)$_$;

> So here's the full query, hopefully formatted better:

I'm still curious about why the planner is choosing a hash join over
using the indexes on the foreign keys, but that might be because the
tables are relatively small.

> That one runs reasonably fine, because there are only 20 videos being
> returned and a handful of video views associated with them.  In the
> real query there are about 1k videos and a couple million views.  That
> took about 80 minutes to run, according to logs.

The planner will choose different plans based on, among other things,
what it estimates the size of the result to be, so while looking at a
small example query might seem like a way to go about looking at
what's going on, it's most likely not going to give you an accurate
representation of the situation. Are you looking at two different
systems (e.g., a development system versus a production system) or
just choosing a smaller query on the same system? If you can't run
the query on your production system, you may want to take a dump of
the production system and set it up on another box. Even with a
couple million rows in the video_views table, PostgreSQL shouldn't
really blink too much, as long as the server is tuned properly, the
hardware is adequate, and the database statistics are up to date.

Michael Glaesemann
grzm seespotcode net





Sorry, I mentioned that it took 90 seconds to run the query but I
didn't show that EXPLAIN ANALYZE output.

Here it is, same query just with a different company_id:

Limit  (cost=879283.07..879283.32 rows=100 width=64) (actual
time=92486.858..92486.891 rows=100 loops=1)
  ->  Sort  (cost=879283.07..879297.15 rows=5632 width=64) (actual
time=92486.856..92486.867 rows=100 loops=1)
        Sort Key: sum(CASE WHEN ((video_views.hit IS TRUE) AND
(video_views.logged_in IS FALSE)) THEN 1 ELSE 0 END)
        ->  GroupAggregate  (cost=803054.95..878932.21 rows=5632
width=64) (actual time=67145.471..92484.408 rows=730 loops=1)
              ->  Sort  (cost=803054.95..809363.98 rows=2523610
width=64) (actual time=67076.407..75441.274 rows=5799447 loops=1)
                    Sort Key: videos.id, videos.title, videos.guid
                    ->  Hash Join  (cost=1220.63..237115.16
rows=2523610 width=64) (actual time=31.230..11507.406 rows=5799447
loops=1)
                          Hash Cond: (video_views.asset_id = assets.id)
                          ->  Hash Join  (cost=535.62..179627.88
rows=2525294 width=60) (actual time=13.286..7621.950 rows=5799447
loops=1)
                                Hash Cond: (video_views.video_id = videos.id)
                                ->  Seq Scan on video_views
(cost=0.00..101352.70 rows=5998470 width=12) (actual
time=0.023..2840.718 rows=5998341 loops=1)
                                ->  Hash  (cost=465.23..465.23
rows=5632 width=52) (actual time=13.216..13.216 rows=5712 loops=1)
                                      ->  Seq Scan on videos
(cost=0.00..465.23 rows=5632 width=52) (actual time=0.038..9.060
rows=5712 loops=1)
                                            Filter: (company_id = 11)
                          ->  Hash  (cost=487.78..487.78 rows=15778
width=12) (actual time=17.876..17.876 rows=15778 loops=1)
                                ->  Seq Scan on assets
(cost=0.00..487.78 rows=15778 width=12) (actual time=0.032..10.880
rows=15778 loops=1)
Total runtime: 92548.006 ms
(17 rows)


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux