On Saturday 03 May 2014 at 23:21:21, Alban Hertroys <haramrae@xxxxxxxxx> wrote:
On 03 May 2014, at 12:45, Andreas Joseph Krogh <andreas@xxxxxxxxxx> wrote:
> Do you really need to query message_property twice? I would think this would give the same results:
>
> SELECT
>     m.id AS message_id,
>     prop.person_id,
>     coalesce(prop.is_read, FALSE) AS is_read,
>     m.subject
> FROM message m
> LEFT OUTER JOIN message_property prop
>     ON prop.message_id = m.id AND prop.person_id = 1 AND prop.is_read = FALSE
> ;
Ah yes, of course that would match a bit too much. This however does give the same results:
SELECT
    m.id AS message_id,
    prop.person_id,
    coalesce(prop.is_read, FALSE) AS is_read,
    m.subject
FROM message m
LEFT OUTER JOIN message_property prop
    ON prop.message_id = m.id AND prop.person_id = 1
WHERE prop.is_read IS NULL OR prop.is_read = FALSE
;
That shaves off half the time of the query here, namely one indexscan.
The remaining time appears to be spent finding the rows in “message” that do not have a corresponding “message_property” for the given (message_id, person_id) tuple. It’s basically trying to find no needle in a haystack: you won’t know that there is no needle until you’ve searched the entire haystack.
It does seem to help a bit to create separate indexes on message_property.message_id and message_property.person_id; that reduces the sizes of the indexes that the database needs to match and merge with each other in order to find the missing message_ids.
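
To make the difference between the two queries above concrete, here is a minimal sketch that runs both against a tiny in-memory SQLite database instead of PostgreSQL. The sample rows are made up for illustration, and is_read is stored as 0/1 rather than a real boolean; the outer-join behaviour being shown is the same in both systems. With the is_read test inside the ON clause, a message that person 1 has already read simply fails to join and comes back NULL-extended, so it is still returned. With the test in WHERE, it is filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE message (id INTEGER PRIMARY KEY, subject TEXT);
    CREATE TABLE message_property (
        message_id INTEGER REFERENCES message(id),
        person_id  INTEGER,
        is_read    INTEGER          -- assumption: 0 = FALSE, 1 = TRUE
    );
    -- Made-up sample data: one message per interesting case for person 1.
    INSERT INTO message VALUES
        (1, 'no property row'), (2, 'unread'), (3, 'read');
    INSERT INTO message_property VALUES (2, 1, 0), (3, 1, 1);
""")

# Variant 1: is_read filter inside the ON clause.
# Message 3 fails the join condition, so it is NULL-extended and still returned.
filter_in_on = conn.execute("""
    SELECT m.id
    FROM message m
    LEFT OUTER JOIN message_property prop
        ON prop.message_id = m.id AND prop.person_id = 1 AND prop.is_read = 0
    ORDER BY m.id
""").fetchall()

# Variant 2: join on (message_id, person_id) only, then filter in WHERE.
# Message 3 joins to its is_read = 1 row and is removed by the WHERE clause.
filter_in_where = conn.execute("""
    SELECT m.id
    FROM message m
    LEFT OUTER JOIN message_property prop
        ON prop.message_id = m.id AND prop.person_id = 1
    WHERE prop.is_read IS NULL OR prop.is_read = 0
    ORDER BY m.id
""").fetchall()

print(filter_in_on)     # [(1,), (2,), (3,)]  -> matches a bit too much
print(filter_in_where)  # [(1,), (2,)]        -> unread or never-seen only
```

Message 1 (no message_property row at all) is the "no needle" case: it can only be found by establishing that nothing matches.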
I think the consensus here is to create a caching table; there's no way around it, as PG is unable to index the difference between two sets.
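
One possible shape for such a caching table, as a rough sketch only: keep one row per (person, message) pair that is still unread, so that "what is unread for person X" becomes an ordinary indexed lookup instead of a set difference. Everything named here (the unread_message and person tables, the add_message, mark_read and unread_for helpers) is hypothetical and not taken from the actual schema. The sketch uses in-memory SQLite to stay self-contained; in PostgreSQL the bookkeeping would more likely live in triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person  (id INTEGER PRIMARY KEY);   -- hypothetical
    CREATE TABLE message (id INTEGER PRIMARY KEY, subject TEXT);
    -- Hypothetical cache: one row per (person, message) that is still unread.
    CREATE TABLE unread_message (
        person_id  INTEGER NOT NULL,
        message_id INTEGER NOT NULL,
        PRIMARY KEY (person_id, message_id)
    );
    INSERT INTO person VALUES (1), (2);
""")

def add_message(message_id, subject):
    """Store a message and mark it unread for every known person."""
    conn.execute("INSERT INTO message VALUES (?, ?)", (message_id, subject))
    conn.execute(
        "INSERT INTO unread_message (person_id, message_id) "
        "SELECT id, ? FROM person",
        (message_id,),
    )

def mark_read(person_id, message_id):
    """Reading a message just removes its row from the cache."""
    conn.execute(
        "DELETE FROM unread_message WHERE person_id = ? AND message_id = ?",
        (person_id, message_id),
    )

def unread_for(person_id):
    """Unread messages are a plain lookup on the cache's primary key."""
    return conn.execute(
        "SELECT m.id, m.subject FROM unread_message u "
        "JOIN message m ON m.id = u.message_id "
        "WHERE u.person_id = ? ORDER BY m.id",
        (person_id,),
    ).fetchall()

add_message(10, "first")
add_message(11, "second")
mark_read(1, 10)
print(unread_for(1))  # [(11, 'second')]
print(unread_for(2))  # [(10, 'first'), (11, 'second')]
```

The trade-off is extra writes: each new message costs one cache row per person, in exchange for never having to search for rows that do not exist.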
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963