Hi Tom, thanks for the great job getting to the core of this
problem... I would say I'm not sure I want to randomize the rows (not
really even sure how to do it without truncating the table and
re-adding the records in a random order). I think for the moment I
will either a) re-write the query per Ismo's suggestion, or b) wait
until more data comes into that table, potentially kicking the query
planner into not using the Nested Loop anymore.
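
For reference, here is a rough sketch of what "randomizing the rows" could look like without a truncate-and-reload, assuming the table is rebuilt into a copy in random order. It is written in Python against an in-memory SQLite database as a stand-in, not against PostgreSQL, and the data is synthetic, shaped only to mimic the skew described in this thread (10 distinct symptom_ids in the first 8k rows, 105 overall). The table name symptom_reports_shuffled and the helper distinct_in_prefix are hypothetical names made up for illustration.

```python
import sqlite3

def distinct_in_prefix(conn, table, n):
    """Count distinct symptom_id values among the first n rows in rowid order."""
    sql = (
        "SELECT COUNT(DISTINCT symptom_id) FROM "
        f"(SELECT symptom_id FROM {table} ORDER BY rowid LIMIT ?)"
    )
    return conn.execute(sql, (n,)).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE symptom_reports (id INTEGER PRIMARY KEY, symptom_id INTEGER)"
)

# Synthetic data (an assumption, not the real table): 8000 rows spread over
# 10 symptom_ids, followed by 5000 rows spread over the remaining 95.
rows = [(i % 10,) for i in range(8000)] + [(10 + i % 95,) for i in range(5000)]
conn.executemany("INSERT INTO symptom_reports (symptom_id) VALUES (?)", rows)

total = conn.execute(
    "SELECT COUNT(DISTINCT symptom_id) FROM symptom_reports"
).fetchone()[0]
before = distinct_in_prefix(conn, "symptom_reports", 8000)

# Rebuild a copy of the table with its rows written in random order.
conn.execute(
    "CREATE TABLE symptom_reports_shuffled AS "
    "SELECT symptom_id FROM symptom_reports ORDER BY random()"
)
after = distinct_in_prefix(conn, "symptom_reports_shuffled", 8000)

print("distinct ids overall:", total)
print("distinct ids in first 8000 rows, original order:", before)
print("distinct ids in first 8000 rows, shuffled copy:", after)
```

PostgreSQL also supports CREATE TABLE ... AS SELECT ... ORDER BY random(), so a similar rebuild should be possible there, but swapping the copy in for the real table (indexes, constraints, other queries that like the current order) is a separate question that this sketch does not address.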
Anyway, thanks again, I appreciate it...
-Jeff
On Mar 7, 2007, at 11:37 AM, Tom Lane wrote:
> Jeff Cole <cole.jeff@xxxxxxxxx> writes:
>> Hi Tom, you are correct, the distribution is uneven... In the 13k
>> symptom_reports rows, there are 105 distinct symptom_ids. But the
>> first 8k symptom_reports rows only have 10 distinct symptom_ids.
>> Could this cause the problem and would there be anything I could do
>> to address it?
>
> Ah-hah, yeah, that explains it. Is it worth your time to deliberately
> randomize the order of the rows in symptom_reports? It wasn't clear
> whether this query is actually something you need to optimize. You
> might have other queries that benefit from the rows being in nonrandom
> order, so I'm not entirely sure that this is a good thing to do ...
>
> regards, tom lane