On Tue, May 23, 2017 at 4:03 AM, Gunnar "Nick" Bluth <gunnar.bluth.extern@xxxxxxxxx> wrote:
Hi Jeff,Am 05/22/2017 um 09:57 PM schrieb Jeff Janes:
>
> create view view2 as select id,
> (
> select md5 from thing_alias where thing_id=id
> order by priority desc limit 1
> ) as md5,
> cutoff from thing;
>
> Cheers,
>
> Jeff
how does something like
CREATE OR REPLACE VIEW public.view3 AS
SELECT thing.id,
foo.md5,
thing.cutoff
FROM thing,
LATERAL ( SELECT DISTINCT ON (thing_alias.thing_id)
thing_alias.thing_id,
thing_alias.md5
FROM thing_alias
WHERE thing_alias.thing_id = thing.id
ORDER BY thing_alias.thing_id, thing_alias.priority DESC) foo
work for you? At least that's always using an index scan here, as
opposed to view1, which (for me) defaults to a SeqScan on thing_alias at
a low cutoff.
Unfortunately that always uses the index scan, even at a high cutoff where aggregation on the seq scan and then hash joining is more appropriate. So it is very similar to view2, except that it doesn't return the rows from "thing" which have zero corresponding rows in thing_alias.
*****
Note btw. that both view1 and view2 don't return any md5 values for me,
while view3 does!
*****
Because of the way I constructed the data, using the power transform of the uniform random distribution, the early rows of the view (if sorted by thing_id) are mostly null in the md5 column, so if you only look at the first few screen-fulls you might not see any md5. But your view does effectively an inner join rather than a left join, so your view gets rid of the rows with a NULL md5. Most things don't have aliases; of the things that do, most have 1; and some have a several.
Cheers,
Jeff