Before trying other solutions I would like to make PG use an index-only scan (it should be fast enough for our purpose).
I have tried to disable the other indexes and forced PG to use this index (which includes all the fields of the query):
index_subscriptions_on_project_id_and_created_at_and_tags
The problem is that the query plan is this:
As you can see it is a *index scan* and not an *index only* scan... I don't understand why. The index includes all the fields used by the query... so an index only scan should be possible.
On Fri, Jan 10, 2020 at 2:34 PM Justin Pryzby <pryzby@xxxxxxxxxxxxx> wrote:
On Fri, Jan 10, 2020 at 12:03:39PM +0100, Marco Colli wrote:
> I have added this index which would allow an index only scan:
> "index_subscriptions_on_project_id_and_created_at_and_tags" btree
> (project_id, created_at DESC, tags) WHERE trashed_at IS NULL
Are those the only columns in subscriptions ?
> But Postgresql continues to use this index (which has less information and
> then requires slow access to disk):
> "index_subscriptions_on_project_id_and_created_at" btree (project_id,
> created_at DESC)
Did you vacuum the table ?
Did you try to "explain" the query after dropping the 1st index (like: begin;
DROP INDEX..; explain analyze..; rollback).
Also, is the first (other) index btree_gin (you can \dx to show extensions) ?
I think it needs to be a gin index to search tags ?
On Fri, Jan 10, 2020 at 01:42:24PM +0100, Marco Colli wrote:
> I would like to try your solution but I read that ALTER TABLE... SET
> STATISTICS locks the table... Since it is just an experiment and we don't
> know if it actually works it would be greate to avoid locking a large table
> (50M) in production.
I suggest to CREATE TABLE test_subscriptions (LIKE subscriptions INCLUDING
ALL); INSERT INTO test_subscriptions SELECT * FROM subscriptions; ANALYZE test_subscriptions;
Anyway, ALTER..SET STATS requires a strong lock but for only a brief moment
(assuming it doesn't have to wait). Possibly you'd be ok doing SET
statement_timeout='1s'; ALTER TABLE....
> Does CREATE STATISTICS lock the table too?
You can check by SET client_min_messages=debug; SET lock_timeout=333; SET log_lock_waits=on;
Looks like it needs ShareUpdateExclusiveLock.
> Does statistics work on an array field like tags? (I can't find any
> information)
It think it'd be data type agnostic. And seems to work with arrays.
On Fri, Jan 10, 2020 at 02:30:27PM +0100, Marco Colli wrote:
> @Justin Pryzby I have tried this as you suggested:
>
> CREATE STATISTICS statistics_on_subscriptions_project_id_and_tags ON
> project_id, tags FROM subscriptions;
> VACUUM ANALYZE subscriptions;
>
> Unfortunately nothing changes and Postgresql continues to use the wrong
> plan (maybe stats don't work well on array fields like tags??).
It'd help to see SELECT stxddependencies FROM pg_statistic_ext WHERE
stxoid='subscriptions'::regclass