Hi,
Thanks for providing all this info :)
On Mon, Jun 25, 2018 at 05:55:49PM +0200, Roman Kushnir wrote:
Hello,
The following basic inner join is taking too much time for me. (I’m using count(videos.id <http://videos.id/>) instead of count(*) because my actual query looks different, but I simplified it here to the essence).
I’ve tried following random people's suggestions and adjusting the random_page_cost(decreasing it from 4 to 1.1) without a stable improvement. Any hints on what is wrong here? Thank you.
Running on Amazon RDS, with default 10.1 parameters
All default ?
https://wiki.postgresql.org/wiki/Server_ConfigurationIt looks like nearly the entire time is spent reading this table:
Parallel Seq Scan on videos ... (ACTUAL TIME=0.687..55,555.774...)
Buffers: shared hit=7138 read=464357
Perhaps shared_buffers should be at least several times larger, and perhaps up
to 4gb to keep the entire table in RAM. You could maybe also benefit from
better device readahead (blockdev --setra or lvchange -r or
/sys/block/sd?/queue/read_ahead_kb)
Also, it looks like there's a row count misestimate, which probably doesn't
matter for the query you sent, but maybe affects your larger query:
Hash Join (... ROWS=365,328 ... ) (... ROWS=45,307 ... )
If that matters, maybe it'd help to increase statistics on channel_id.
Actually, I see both tables have FK into channels.id:
"fk_rails_11d6d9bea2" FOREIGN KEY (channel_id) REFERENCES channels(id)
"fk_rails_257f68ae55" FOREIGN KEY (channel_id) REFERENCES channels(id)
I don't see the definition of "channels" (and it looks like the query I put on
the wiki doesn't show null_frac), but I think that postgres since 9.6 should be
able to infer good join statistics from the existence of the FKs. Maybe that
only works if you actually JOIN to the channels table (?). But if anything
that's only a 2ndary problem, if at all.
Justin