Hello:
I'm an ignorant in what refers to performance analysis of PostgreSQL.
I've a doubt about how the PostgreSQL planner makes a hash join. I've
tried to "dig" into the archive of this mailing list but I haven't found
what I'm looking for. So I'm explaining my doubt with an example to see
if anyone can help me.
Let's suppose that I've 2 tables, one of students and the other one of
parents in a many-to-one relation. I want to do something like this:
SELECT s.complete_name, f.complete_name
FROM students AS s
JOIN fathers AS f ON f.id_father = s.id_father;
Using the ANALYZE command, I've checked that the planner firstly scans
and extracts the required information from "fathers", builds a temporary
hash table from it, then scans "students", and finally joins the
information from this table and the temporary one employing the relation
"f.id_father = s.id_father".
My doubt is about this last step. When the planner checks the temporary
table looking for the parent of a student:
A) Does it run through the temporary father's table one time per
student? This means that if there are 500 students, it's doing 500 loops
on the temporary table.
B) Or does it try to internally group students with the same father ID
to avoid doing "absurd" loops on the temporary one?
That's all. Thank you very much for your kindness :) .
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance