Laurenz Albe <laurenz.albe@xxxxxxxxxxx> writes:
> On Tue, 2021-11-30 at 22:38 +0100, pgsql-performance@xxxxxxxxxx wrote:
>> INSERT INTO song (artist, title)
>> SELECT 'artist','title'
>> FROM generate_series(1,10000);
>>
>> \set query '12345678'
>>
>> -- This query is slow
>> EXPLAIN ANALYZE
>> SELECT song.artist, song.title
>> FROM song
>> WHERE (song.artist %> :'query' OR song.title %> :'query')
>> ;

> The table is quite small; with a bigger table, the test would be more
> meaningful.

Yeah, this test case seems very unrealistic, both as to table size and
as to the lack of variability of the table entries.  I think the latter
is causing the indexscans to take less time than they otherwise might,
because none of the extracted trigrams find any matches.

> Since you have SSDs, you should tune "random_page_cost = 1.1".

Right.  Poking at gincostestimate a bit, I see that for this operator
the indexscan cost estimate is basically driven by the number of
trigrams extracted from the query string (nine in this test case) and
the index size; those lead to a predicted number of index page fetches
that's then scaled by random_page_cost.  That's coming out to make it
look more expensive than the seqscan.  It's actually not more expensive,
but that's partially because page fetch costs are really zero in this
test case (everything will stay in shared buffers the whole time), and
partially because the unrealistic data pattern is leading to not having
to look at as much of the index as gincostestimate expected.

In general, it appears correct that longer query strings lead to a
higher index cost estimate, because they produce more trigrams so
there's more work for the index match to do.  (At some level, a longer
query means more work in the seqscan case too; but our cost models are
inadequate to predict that.)

			regards, tom lane
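
For anyone reproducing this with a more meaningful data set, here is a
rough sketch of a larger, more varied test setup along the lines Laurenz
and Tom describe.  The column types, row count, and GIN index definitions
are assumptions, since the original message does not show the table or
index DDL:

-- Hypothetical setup: bigger table, varied strings, trigram GIN indexes.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE TABLE song (artist text, title text);

INSERT INTO song (artist, title)
SELECT 'artist ' || md5(i::text),
       'title '  || md5((i * 2)::text)
FROM generate_series(1, 1000000) AS i;

CREATE INDEX song_artist_trgm_idx ON song USING gin (artist gin_trgm_ops);
CREATE INDEX song_title_trgm_idx  ON song USING gin (title  gin_trgm_ops);

ANALYZE song;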
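
To try the planner setting Laurenz suggests, something along these lines
should work; ALTER SYSTEM is only one of several ways to make it stick:

-- Session-level, for comparing plans interactively:
SET random_page_cost = 1.1;
EXPLAIN ANALYZE
SELECT song.artist, song.title
FROM song
WHERE (song.artist %> :'query' OR song.title %> :'query');

-- Cluster-wide, once the value has proven out on the SSD storage:
ALTER SYSTEM SET random_page_cost = 1.1;
SELECT pg_reload_conf();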
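
The trigram count that gincostestimate works from can be inspected
directly with pg_trgm's show_trgm(); a small sketch (the longer string
below is just an illustrative example):

-- The 8-character query string yields the nine trigrams Tom mentions:
SELECT show_trgm('12345678');
-- {"  1"," 12","123","234","345","456","567","678","78 "}

-- A longer query string produces more trigrams, and hence a higher
-- GIN index cost estimate:
SELECT array_length(show_trgm('12345678'), 1)         AS short_query,
       array_length(show_trgm('abcdefghijklmnop'), 1) AS long_query;
-- short_query = 9, long_query = 17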