It's not an issue for me (it's not really impacting performance), but since it was odd I thought I might ask.

I have this supermegaquery:

SELECT t.date AS status_date,
       lu.id AS memberid,
       lu.username AS username,
       u.url AS url,
       ub.url_pattern AS urlpattern,
       lu.email AS email,
       lu.birth_date AS birthdate,
       lu.creation_date AS creationdate,
       s.name AS state,
       co.name AS country,
       opd.survey_id AS originalSurvey,
       c.name AS city,
       lu.confirmed AS confirmed,
       pd.name AS action,
       sd.duration AS loi
FROM tracks t
LEFT JOIN surveyduration_v sd ON sd.member_id = t.member_id
                             AND sd.survey_id = 5936
INNER JOIN all_users_v lu ON lu.id = t.member_id
                         AND lu.panel_source_id = 1
LEFT JOIN track_status ts ON ts.id = t.track_status_id
LEFT JOIN partners p ON p.id = t.partner_id
LEFT JOIN urls u ON u.id = t.url_id AND u.survey_id = 5936
LEFT JOIN url_batchs ub ON u.url_batch_id = ub.id
LEFT JOIN states s ON lu.state_id = s.id
LEFT JOIN cities c ON lu.city_id = c.id
LEFT JOIN countries co ON lu.country_id = co.id
LEFT JOIN partner_deliveries pd ON pd.id = t.partner_delivery_id
                               AND t.partner_id IS NOT NULL
LEFT JOIN partner_deliveries opd ON opd.id = pd.originator_id
WHERE t.survey_id = 5936
  AND t.track_status_id IN (5)

with the views:

CREATE OR REPLACE VIEW surveyduration_v AS
SELECT date_part('epoch'::text, t.date - tl2.date) / 60::double precision AS duration,
       t.member_id,
       t.survey_id
FROM tracks t
JOIN track_logs tl2 ON t.id = tl2.track_id
WHERE tl2.track_status_id = 8
  AND t.track_status_id = 7;

CREATE OR REPLACE VIEW all_users_v AS
SELECT 1 AS panel_source_id,
       livra_users.id,
       livra_users.birth_date,
       livra_users.creation_date,
       livra_users.email,
       livra_users.first_name,
       livra_users.last_name,
       livra_users.username,
       livra_users.image_link,
       livra_users.confirmed,
       livra_users.is_panelist,
       livra_users.unregistered,
       livra_users.reason,
       livra_users.privacy,
       livra_users.sex,
       livra_users.site,
       livra_users.country_id,
       livra_users.state_id,
       livra_users.city_id,
       livra_users.last_activity_date,
       livra_users.partner_id,
       livra_users.survey_id,
       livra_users.panelist_update,
       livra_users.panelist_percentage
FROM livra_users
UNION ALL
SELECT 2 AS panel_source_id,
       - external_users.id AS id,
       NULL::timestamp without time zone AS birth_date,
       external_users.creation_date,
       external_users.email,
       NULL::character varying AS first_name,
       NULL::character varying AS last_name,
       external_users.username,
       NULL::character varying AS image_link,
       true AS confirmed,
       external_users.is_panelist,
       false AS unregistered,
       NULL::integer AS reason,
       0 AS privacy,
       NULL::integer AS sex,
       external_users.site,
       external_users.country_id,
       NULL::integer AS state_id,
       NULL::integer AS city_id,
       NULL::timestamp without time zone AS last_activity_date,
       NULL::integer AS partner_id,
       external_users.survey_id,
       NULL::bigint AS panelist_update,
       NULL::smallint AS panelist_percentage
FROM external_users;

The server is PostgreSQL 9.0.3 running on Linux.

The big tables are tracks, track_logs and urls, all > 30M rows.

One detail that could be related is that tracks.member_id is an undeclared (denormalized) foreign key to livra_users.
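For reference, the per-column statistics the planner works from can be inspected via pg_stats; something like the following quick check on the join columns (note pg_stats only lists real tables, so all_users_v itself has no entry there):

-- Inspect the planner's statistics for the two join columns.
-- The UNION ALL view has no statistics of its own.
SELECT tablename, attname, n_distinct, null_frac
FROM pg_stats
WHERE (tablename, attname) IN (('tracks', 'member_id'),
                               ('livra_users', 'id'));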
The resulting plan is:

Hash Left Join  (cost=51417.93..974563.27 rows=2241518 width=1276)
  Hash Cond: ("*SELECT* 1".country_id = co.id)
  ->  Hash Left Join  (cost=51415.40..941722.50 rows=2241518 width=1271)
        Hash Cond: ("*SELECT* 1".state_id = s.id)
        ->  Hash Left Join  (cost=51373.45..910859.68 rows=2241518 width=1263)
              Hash Cond: (t.partner_delivery_id = pd.id)
              Join Filter: (t.partner_id IS NOT NULL)
              ->  Hash Left Join  (cost=32280.78..854175.26 rows=2241518 width=1256)
                    Hash Cond: ("*SELECT* 1".city_id = c.id)
                    ->  Hash Join  (cost=24183.20..792841.63 rows=2241518 width=1249)
                          Hash Cond: ("*SELECT* 1".id = t.member_id)
                          ->  Append  (cost=0.00..148254.38 rows=3008749 width=168)
                                ->  Subquery Scan on "*SELECT* 1"  (cost=0.00..140223.96 rows=3008748 width=168)
                                      ->  Seq Scan on livra_users  (cost=0.00..110136.48 rows=3008748 width=168)
                                ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..8030.42 rows=1 width=60)
                                      ->  Result  (cost=0.00..8030.41 rows=1 width=60)
                                            One-Time Filter: false
                                            ->  Seq Scan on external_users  (cost=0.00..8030.41 rows=1 width=60)
                          ->  Hash  (cost=24181.34..24181.34 rows=149 width=188)
                                ->  Hash Left Join  (cost=21650.42..24181.34 rows=149 width=188)
                                      Hash Cond: (u.url_batch_id = ub.id)
                                      ->  Nested Loop Left Join  (cost=20828.08..23355.84 rows=149 width=115)
                                            ->  Merge Left Join  (cost=20828.08..20841.04 rows=149 width=44)
                                                  Merge Cond: (t.member_id = t.member_id)
                                                  ->  Sort  (cost=435.90..436.27 rows=149 width=32)
                                                        Sort Key: t.member_id
                                                        ->  Index Scan using idx_tracks_survey_id_track_status_id on tracks t  (cost=0.00..430.52 rows=149 width=32)
                                                              Index Cond: ((survey_id = 5936) AND (track_status_id = 5))
                                                  ->  Sort  (cost=20392.18..20398.28 rows=2440 width=20)
                                                        Sort Key: t.member_id
                                                        ->  Nested Loop  (cost=0.00..20254.90 rows=2440 width=20)
                                                              ->  Index Scan using idx_tracks_survey_id_track_status_id on tracks t  (cost=0.00..2010.03 rows=712 width=20)
                                                                    Index Cond: ((survey_id = 5936) AND (track_status_id = 7))
                                                              ->  Index Scan using idx_track_logs_track_id on track_logs tl2  (cost=0.00..25.59 rows=3 width=16)
                                                                    Index Cond: (tl2.track_id = t.id)
                                                                    Filter: (tl2.track_status_id = 8)
                                            ->  Index Scan using urls_pkey on urls u  (cost=0.00..16.87 rows=1 width=87)
                                                  Index Cond: (u.id = t.url_id)
                                                  Filter: (u.survey_id = 5936)
                                      ->  Hash  (cost=637.15..637.15 rows=14815 width=81)
                                            ->  Seq Scan on url_batchs ub  (cost=0.00..637.15 rows=14815 width=81)
                    ->  Hash  (cost=4578.37..4578.37 rows=281537 width=15)
                          ->  Seq Scan on cities c  (cost=0.00..4578.37 rows=281537 width=15)
              ->  Hash  (cost=18270.17..18270.17 rows=65799 width=19)
                    ->  Hash Left Join  (cost=8842.48..18270.17 rows=65799 width=19)
                          Hash Cond: (pd.originator_id = opd.id)
                          ->  Seq Scan on partner_deliveries pd  (cost=0.00..8019.99 rows=65799 width=19)
                          ->  Hash  (cost=8019.99..8019.99 rows=65799 width=8)
                                ->  Seq Scan on partner_deliveries opd  (cost=0.00..8019.99 rows=65799 width=8)
        ->  Hash  (cost=24.20..24.20 rows=1420 width=16)
              ->  Seq Scan on states s  (cost=0.00..24.20 rows=1420 width=16)
  ->  Hash  (cost=1.68..1.68 rows=68 width=13)
        ->  Seq Scan on countries co  (cost=0.00..1.68 rows=68 width=13)

The curious bit is the estimated rowcount (2241518), which is grossly off. It shows up at the hash join between the all_users_v view and tracks: the tracks side is estimated at 149 rows (more or less on target), and it's a join of an int PK to an int column, so I cannot imagine how that join could result in 2M rows. Even a full cross product of two ~150-row partial results couldn't get that high. What is pg thinking to get to that number?
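One experiment that might narrow it down (an untested sketch, using the same tables as above): since lu.panel_source_id = 1 pins the query to the livra_users branch of the UNION ALL anyway, compare the estimate when joining the base table directly, where the planner can see livra_users.id statistics:

-- Same join, bypassing the UNION ALL view.
EXPLAIN
SELECT t.member_id
FROM tracks t
JOIN livra_users lu ON lu.id = t.member_id
WHERE t.survey_id = 5936
  AND t.track_status_id IN (5);

If that comes out near 149 rows, the misestimation is presumably about how the planner handles the UNION ALL subquery rather than about the join itself.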
Performance isn't impacted; the plan, even with the misestimation, is near optimal. But I can imagine this kind of misestimation wreaking havoc in other situations.
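PS: Doing the arithmetic on the plan's own inputs: the hash join's outer side (the Append over the view) is estimated at 3008749 rows and the inner hash at 149, and 3008749 * 149 * 0.005 = 2241518 to the row. 0.005 is 1/200, and 200 is, if I'm reading the planner defaults right, the n_distinct fallback PostgreSQL uses when it has no statistics for a join column, which would be the case for the columns coming out of the UNION ALL subquery. So my guess is the estimate is default join selectivity applied over the whole 3M-row view output rather than anything derived from the actual data.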