On 7/18/07, Michael Glaesemann <grzm@xxxxxxxxxxxxxxx> wrote:
On 7/18/07, Pat Maddox <pergesu@xxxxxxxxx> wrote: > I've got a query that's taking forever (as will be obvious when you > see it and the explain output). I can't figure out what indexes to > add to make this run faster. I'd appreciate any help. > I'm curious why it's choosing to use hash joins rather than taking advantage of the indexes you have on the foreign key columns. What are the table definitions? Are hit, logged_in, played, downloaded all columns of videos_views? Have you ANALYZEd these tables? What is the EXPLAIN ANALYZE output for this query? You only provided the EXPLAIN output, which doesn't compare the plan with the actual query cost. I found it quite difficult to read you query. I reformatted it and also used some SQL functions to abstract away the CASE expressions. These SQL functions will probably be inlined so there should be very little overhead. If you have a boolean column, you don't need to test IS TRUE or IS FALSE: you can just use the value itself. I also find it helpful to separate the join conditions (in the JOIN clause) from the restrictions (the WHERE clause), which I've done below. CREATE OR REPLACE FUNCTION ternary(BOOLEAN, INTEGER, INTEGER) RETURNS INTEGER LANGUAGE SQL AS $_$SELECT CASE WHEN $1 THEN $2 ELSE $3 END$_$; CREATE OR REPLACE FUNCTION value_when(BOOLEAN, INTEGER) RETURNS INTEGER LANGUAGE SQL AS $_$SELECT ternary($1,$2,0)$_$; CREATE OR REPLACE FUNCTION one_when(BOOLEAN) RETURNS INTEGER LANGUAGE SQL as $_$SELECT value_when($1,1)$_$; SELECT SUM (one_when(hit AND logged_in)) AS count_hits_console , SUM (one_when(hit AND NOT logged_in)) AS count_hits_remote , SUM (one_when(played AND logged_in)) AS count_played_console , SUM (one_when(played AND NOT logged_in)) AS count_played_remote , SUM (one_when(downloaded AND logged_in)) AS count_downloaded_console , SUM (one_when(downloaded AND NOT logged_in)) AS count_downloaded_remote , SUM (value_when((played OR downloaded) AND logged_in, assets.size)) as download_size_console , SUM (value_when((played OR downloaded) AND NOT logged_in), assets.size) as download_size_remote , videos.id , videos.title , videos.guid FROM video_views JOIN assets ON (video_views.video_id=videos.id) JOIN videos ON (video_views.asset_id=assets.id) WHERE videos.company_id=1 GROUP BY videos.id , videos.title , videos.guid ORDER BY count_hits_remote DESC LIMIT 100 Michael Glaesemann grzm seespotcode net
Michael, I tried your SQL but it didn't work - it was missing the videos table in the FROM clause. But when I add it, I get the error: ERROR: invalid reference to FROM-clause entry for table "video_views" LINE 20: JOIN assets ON (video_views.video_id=videos.id) ^ HINT: There is an entry for table "video_views", but it cannot be referenced from this part of the query. Not really sure what that means. Here are the table definitions: twistage_development=# \d video_views Table "public.video_views" Column | Type | Modifiers ------------+-----------------------------+---------------------------------------------------------- id | integer | not null default nextval('video_views_id_seq'::regclass) video_id | integer | created_at | timestamp without time zone | asset_id | integer | played | boolean | default false downloaded | boolean | default false hit | boolean | default false logged_in | boolean | default false Indexes: "video_views_pkey" PRIMARY KEY, btree (id) "index_video_views_on_asset_id" btree (asset_id) "index_video_views_on_video_id" btree (video_id) "index_video_views_on_video_id_and_asset_id_and_created_at" btree (video_id, created_at, asset_id) twistage_development=# \d videos Table "public.videos" Column | Type | Modi fiers -----------------------+-----------------------------+-------------------------- --------------------------- id | integer | not null default nextval( 'videos_id_seq'::regclass) title | character varying(255) | duration | double precision | description | text | status | character varying(255) | user_id | integer | created_at | timestamp without time zone | upload_finished | boolean | default false publisher_name | character varying(255) | company_id | integer | available_for_display | boolean | default true guid | character varying(255) | main_asset_id | integer | container_type | character varying(255) | codec | character varying(255) | site_id | integer | deleted_at | timestamp without time zone | purged_at | timestamp without time zone | remote_hits_count | integer | default 0 Indexes: "videos_pkey" PRIMARY KEY, btree (id) "complete_videos_without_deleted_at" btree (company_id, status) "index_complete_videos" btree (deleted_at, purged_at, status, created_at, co mpany_id) "index_videos_on_company_id" btree (company_id) "index_videos_on_deleted_at_and_status_and_site_id" btree (status, deleted_a t, site_id) "index_videos_on_guid" btree (guid) "index_videos_on_publisher_name" btree (publisher_name) "index_videos_on_site_id" btree (site_id) "index_videos_on_user_id" btree (user_id) twistage_development=# \d assets Table "public.assets" Column | Type | Modifiers -----------------+-----------------------------+----------------------------------------------------- id | integer | not null default nextval('assets_id_seq'::regclass) video_id | integer | video_format_id | integer | guid | character varying(255) | source_path | character varying(255) | size | bigint | vresolution | integer | hresolution | integer | video_bitrate | integer | frame_rate | integer | container | character varying(255) | vcodec | character varying(255) | status | character varying(255) | deleted_at | timestamp without time zone | audio_bitrate | integer | acodec | character varying(255) | duration | double precision | Indexes: "assets_pkey" PRIMARY KEY, btree (id) "index_assets_on_video_format_id" btree (video_format_id) "index_assets_on_video_id" btree (video_id)