On 06/21/2016 03:33 PM, Jonathan Vanasco wrote:
In effort of simplifying the work, I've created indexes on t_a that have all the related columns.
CREATE INDEX test_idx ON t_a(col_1, id) WHERE col_2 IS NOT FALSE;
CREATE INDEX test_idx__a ON t_a(col_1, id) WHERE col_2 IS NOT FALSE;
Aside from the name these indexes are identical...
postgres will query test_idx__a first (yay!) but then does a bitmap heap scan on t_a, and uses the raw t_a for the hash join.
I don't actually need any information from t_a - it's just there for the filtering, and ideally postgres would just use the index.
This is the description of a semi-join.
WHERE EXISTS (SELECT 1 FROM t_a WHERE t_a.id = t_a2b.a_id AND t_a.col_1 = 730 AND t_a.col_2 IS NOT FALSE)
I thought this might have been from using a partial index, but the same results happen with a full index. I just can't seem to avoid this hash join against the full table.
anyone have a suggestion?
The below works without including t_a in the FROM?
example query
SELECT t_a2b.b_id AS b_id,
count(t_a2b.b_id) AS counted
FROM t_a2b
WHERE
t_a2b.col_a = 1
AND
t_a.col_1 = 730
AND
t_a.col_2 IS NOT False
GROUP BY t_a2b.b_id
ORDER BY counted DESC,
t_a2b.b_id ASC
These two items combined reduce the desirability of diagnosing this...it doesn't seem like you've faithfully recreated the scenario for us to evaluate.
Your post is also not self-contained and you haven't provided the actual EXPLAINs you are getting.
David J.