I've encountered a very strange result when using a view in a query VS the underlying sql of the view. When accessing the view I get a different query plan than running the sql that makes up the view. Using the view is 10x slower and generates a strange query plan that seems to ignore the existence of an index.
Two tables at play:
session and person. a person has zero or more sessions
The following indexes exist:
- last_seen on person
- person_id on person
- person_id on session
SQL below:
WITH cte AS (
SELECT session_.person_id, count(*) AS session_count
FROM session
GROUP BY session_.person_id
)
SELECT p.*
FROM person_ p
LEFT JOIN cte sc ON (p.person_id = sc.person_id)
ORDER BY p.last_seen DESC
LIMIT 5
Below is the view which simply wraps the above sql (minus the order by and limit) in CREATE VIEW :
CREATE OR REPLACE VIEW person_view_ AS
WITH cte AS (
SELECT session.person_id, count(*) AS session_count
FROM session
GROUP BY session.person_id
)
SELECT p.*
FROM person p
LEFT JOIN cte sc ON (p.person_id = sc.person_id)
When I use the above view like so:
select P.*
from person_view as P
order by last_seen desc
limit 5
I get a slow running query (over 10 seconds) with the following query plan:
Limit (cost=374531.52..374531.59 rows=5 width=374)
-> Subquery Scan on p (cost=374531.52..380250.77 rows=457540 width=374)
-> Sort (cost=374531.52..375675.37 rows=457540 width=3329)
Sort Key: p_1.last_seen
CTE cte
-> GroupAggregate (cost=0.42..37443.15 rows=138975 width=25)
Group Key: session_2f28b2dd_cfa9_40bb_89ed_25482f835838.person_id
-> Index Only Scan using session_2f28b2dd_cfa9_40bb_89ed_25482f835838_person_id_idx on session_2f28b2dd_cfa9_40bb_89ed_25482f835838 (cost=0.42..32164.15 rows=777850 width=25)
-> Hash Right Join (cost=60712.65..294071.52 rows=457540 width=3329)
Hash Cond: (sc.person_id = p_1.person_id)
-> CTE Scan on cte sc (cost=0.00..2779.50 rows=138975 width=108)
-> Hash (cost=54993.40..54993.40 rows=457540 width=3321)
-> Seq Scan on person_2f28b2dd_cfa9_40bb_89ed_25482f835838 p_1 (cost=0.00..54993.40 rows=457540 width=3321)
But when I run the sql in the view directly (first code block in post) I get results in under a second with the following query plan:
Limit (cost=37443.57..53083.10 rows=5 width=3329)
CTE cte
-> GroupAggregate (cost=0.42..37443.15 rows=138975 width=25)
Group Key: session_2f28b2dd_cfa9_40bb_89ed_25482f835838.person_id
-> Index Only Scan using session_2f28b2dd_cfa9_40bb_89ed_25482f835838_person_id_idx on session_2f28b2dd_cfa9_40bb_89ed_25482f835838 (cost=0.42..32164.15 rows=777850 width=25)
-> Nested Loop Left Join (cost=0.42..1431141564.79 rows=457540 width=3329)
Join Filter: (p.person_id = sc.person_id)
-> Index Scan Backward using person_2f28b2dd_cfa9_40bb_89ed_25482f835838_last_seen_idx on person_2f28b2dd_cfa9_40bb_89ed_25482f835838 p (cost=0.42..213565.14 rows=457540 width=3321)
-> CTE Scan on cte sc (cost=0.00..2779.50 rows=138975 width=108)
If you look at both query plans you will see that when querying the view it performs Seq Scan on person but when using the underlying sql its does Index Scan Backward using person
Why are indexes being ignored by postgresql when using the view?
Regards,
Max Kremer