Search Postgresql Archives

Re: Need help optimizing this query

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 7/18/07, Michael Glaesemann <grzm@xxxxxxxxxxxxxxx> wrote:

On Jul 18, 2007, at 16:12 , Pat Maddox wrote:

> ERROR:  invalid reference to FROM-clause entry for table "video_views"
> LINE 20: JOIN assets ON (video_views.video_id=videos.id)
>                         ^
> HINT:  There is an entry for table "video_views", but it cannot be
> referenced from this part of the query.

It's because I mismatched the JOIN clauses during my copy-and-paste :(

> On 7/18/07, Michael Glaesemann <grzm@xxxxxxxxxxxxxxx> wrote:

>> FROM video_views
>> JOIN assets ON (video_views.video_id=videos.id)
>> JOIN videos ON (video_views.asset_id=assets.id)

This should be

FROM video_views
JOIN assets ON (video_views.asset_id=assets.id)
JOIN videos ON (video_views.video_id=videos.id)

Do you have the EXPLAIN ANALYE output of the query?

Michael Glaesemann
grzm seespotcode net




For some reason the functions you wrote are giving me trouble (there's
a BIGINT involved, I tried changing the functions around but kept
having issues).  So here's the full query, hopefully formatted better:

SELECT
   SUM(CASE WHEN (hit IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0
END) AS count_hits_console,
   SUM(CASE WHEN (hit IS TRUE AND logged_in IS FALSE) THEN 1 ELSE 0
END) AS count_hits_remote,
   SUM(CASE WHEN (played IS TRUE AND logged_in IS TRUE) THEN 1 ELSE 0
END) AS count_played_console,
   SUM(CASE WHEN (played IS TRUE AND logged_in IS FALSE) THEN 1 ELSE
0 END) AS count_played_remote,
   SUM(CASE WHEN (downloaded IS TRUE AND logged_in IS TRUE) THEN 1
ELSE 0 END) AS count_downloaded_console,
   SUM(CASE WHEN (downloaded IS TRUE AND logged_in IS FALSE) THEN 1
ELSE 0 END) AS count_downloaded_remote,
   SUM(CASE WHEN ((played IS TRUE OR downloaded IS TRUE) AND
logged_in IS TRUE) THEN assets.size ELSE 0 END) as
download_size_console,
   SUM(CASE WHEN ((played IS TRUE OR downloaded IS TRUE) AND
logged_in IS FALSE) THEN assets.size ELSE 0 END) AS
download_size_remote,
   videos.id,
   videos.title,
   videos.guid
FROM video_views
JOIN assets ON (video_views.asset_id=assets.id)
JOIN videos on (video_views.video_id=videos.id)
WHERE videos.company_id=1
GROUP BY  videos.id,
         videos.title,
         videos.guid
ORDER BY count_hits_remote DESC
LIMIT 100



and here's the EXPLAIN ANALYZE output:

Limit  (cost=127072.90..127073.12 rows=87 width=64) (actual
time=2636.560..2636.567 rows=20 loops=1)
  ->  Sort  (cost=127072.90..127073.12 rows=87 width=64) (actual
time=2636.558..2636.562 rows=20 loops=1)
        Sort Key: sum(CASE WHEN ((video_views.hit IS TRUE) AND
(video_views.logged_in IS FALSE)) THEN 1 ELSE 0 END)
        ->  HashAggregate  (cost=127067.49..127070.10 rows=87
width=64) (actual time=2636.481..2636.506 rows=20 loops=1)
              ->  Hash Join  (cost=880.96..125995.46 rows=38983
width=64) (actual time=24.904..2635.719 rows=122 loops=1)
                    Hash Cond: (video_views.asset_id = assets.id)
                    ->  Hash Join  (cost=195.96..124433.01 rows=39009
width=60) (actual time=8.327..2618.982 rows=122 loops=1)
                          Hash Cond: (video_views.video_id = videos.id)
                          ->  Seq Scan on video_views
(cost=0.00..101352.70 rows=5998470 width=12) (actual
time=0.031..1410.231 rows=5998341 loops=1)
                          ->  Hash  (cost=194.87..194.87 rows=87
width=52) (actual time=1.001..1.001 rows=90 loops=1)
                                ->  Bitmap Heap Scan on videos
(cost=4.93..194.87 rows=87 width=52) (actual time=0.111..0.840 rows=90
loops=1)
                                      Recheck Cond: (company_id = 1)
                                      ->  Bitmap Index Scan on
index_videos_on_company_id  (cost=0.00..4.90 rows=87 width=0) (actual
time=0.079..0.079 rows=90 loops=1)
                                            Index Cond: (company_id = 1)
                    ->  Hash  (cost=487.78..487.78 rows=15778
width=12) (actual time=16.527..16.527 rows=15778 loops=1)
                          ->  Seq Scan on assets  (cost=0.00..487.78
rows=15778 width=12) (actual time=0.023..9.601 rows=15778 loops=1)
Total runtime: 2637.043 ms
(17 rows)


That one runs reasonably fine, because there are only 20 videos being
returned and a handful of video views associated with them.  In the
real query there are about 1k videos and a couple million views.  That
took about 80 minutes to run, according to logs.

Pat


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux