Hi All!
I'm currently writing a view that joins many tables and I was wondering how PostgreSQL optimizes
I have created a minimal example. Suppose we have two tables:
> create table a(a_id int primary key generated always as identity, a_data text);
> create table b(b_id int primary key generated always as identity, a_id int not null references a(a_id), b_data text);
Then the query
> explain select b_id from b natural left join a;
results in a `Seq Scan on b`. Whereas the query
> explain select b_id from b natural join a;
results in a join with sequential scans on both a and b.
I believe because b.a_id is not null and references a.a_id a left and an inner join are exactly equivalent.
My questions are:
- Am I wrong that in such a situation a left and inner join are equivalent?
I'm currently writing a view that joins many tables and I was wondering how PostgreSQL optimizes
projections on such a view. In particular I was wondering if it is a correct and valid optimization
technique to use left joins when they are equivalent to an inner join.
I have created a minimal example. Suppose we have two tables:
> create table a(a_id int primary key generated always as identity, a_data text);
> create table b(b_id int primary key generated always as identity, a_id int not null references a(a_id), b_data text);
Then the query
> explain select b_id from b natural left join a;
results in a `Seq Scan on b`. Whereas the query
> explain select b_id from b natural join a;
results in a join with sequential scans on both a and b.
I believe because b.a_id is not null and references a.a_id a left and an inner join are exactly equivalent.
My questions are:
- Am I wrong that in such a situation a left and inner join are equivalent?
- Why does PostgreSQL not automatically optimize this?
- Is it a bad idea to use left joins to optimize this even if semantically an inner join would be correct?
Thank you for your help!