AlexK987 <alex.cue.987@xxxxxxxxx> writes: >>> I've created a GIN index on an INT[] column, but it slows down the selects. >>> Here is my table: >> >>> create table talent(person_id INT NOT NULL, >>> skills INT[] NOT NULL); >> >>> insert into talent(person_id, skills) >>> select generate_series, array[0, 1] || generate_series >>> from generate_series(3, 1048575); >> >>> create index talent_skills on talent using gin(skills); >> >>> analyze talent; >> >>> Here is my select: >> >>> explain analyze >>> select * from talent >>> where skills <@ array[1, 15] >> >>Well, that's pretty much going to suck given that data distribution. >>Since "1" is a member of every last entry, the GIN scan will end up >>examining every entry, and then rejecting all of them as not being >>true subsets of [1,15]. > >This is equivalent and fast: > >explain analyze >WITH rare AS ( > select * from talent > where skills @> array[15]) >select * from rare > where skills @> array[1] > -- (with changed operator) > >You might variate your query according to an additional table that keeps the occurrence count of all skills. >Not really pretty though. I wonder if in such cases, the Bitmap Index Scan could discard entries that would result in a table scan and use them only in the recheck part: explain select * from talent where skills @> array[1] Seq Scan on talent (cost=0.00..21846.16 rows=1048573 width=37) Filter: (skills @> '{1}'::integer[]) -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance