Michael Lewis <mlewis@xxxxxxxxxxx> writes: > On Mon, Dec 28, 2020 at 7:51 AM Christopher Baines <mail@xxxxxxxxxxx> wrote: > >> derivation_inputs: >> COUNT(*): 285422539 >> reltuples: 285422528 >> >> derivation_id: >> COUNT(DISTINCT): 7508610 >> n_distinct: 4336644 (~57% of the true value) >> >> derivation_output_id: >> COUNT(DISTINCT): 5539406 >> n_distinct: 473762 (~8% of the true value) >> > > If you expect the ratio of distinct of derivation_output_id values to be > roughly linear going forward, you can set a custom value for n_distinct on > the column (currently seems like -.0194, aka distinct count > of derivation_output_id divided by reltuples of the table). You could also > do this analysis every month or six and set the custom value as needed. > > https://www.postgresql.org/docs/current/sql-altertable.html > > I am not sure if it will resolve your query problems though. Thanks Michael, I didn't realise a custom value could be set, but I'll look in to this. I actually managed to speed the query up enough by increasing work_mem/shared_buffers. I didn't realise one of the sequential scans was executing 14 times, but giving PostgreSQL more resources means it just executes once, which really helps. Thanks again, Chris
Attachment:
signature.asc
Description: PGP signature