I want to use pg_trgm for auto-suggest functionality. I
created a Materialized View with the information that I need,
with the relevant columns being (keywords text, rank int).
keywords is the column from which I build the tri-grams, and
rank is some popularity factor so that popular results will show
up higher than less popular results given the same tri-gram
distance.
I want to return results in the order of [distance],
[distance_word], [rank]. The input comes from the user and is
not known in advance. My query is as follows:
SELECT title
,id
,(input <-> keywords) AS distance
,(input <<-> keywords) AS
distance_word
,rank
FROM (VALUES (cast('red pill' AS text)))
consts(input)
,mv_autosuggest
ORDER BY 3, 4, 5
LIMIT 20;
This gives me pretty good results, but it takes too long and is
not likely to scale well.
I have created two indexes but neither seem to be used:
CREATE INDEX mv_autosuggest_keywords_tgrm_gist ON
staging.mv_autosuggest USING gist (keywords gist_trgm_ops);
CREATE INDEX mv_autosuggest_keywords_tgrm_gin ON
staging.mv_autosuggest USING gin (keywords gin_trgm_ops);
This is the result of explain analyze:
QUERY
PLAN
|
-------------------------------------------------------------------------------------------------------------------------------------------|
Limit (cost=356.41..356.46 rows=20 width=51) (actual
time=163.132..163.135 rows=20
loops=1) |
-> Sort (cost=356.41..372.96 rows=6619 width=51)
(actual time=163.130..163.131 rows=20
loops=1) |
Sort Key: (('red pill'::text <->
mv_autosuggest.keywords)), (('red pill'::text <<->
mv_autosuggest.keywords)), mv_autosuggest.rank |
Sort Method: top-N heapsort Memory:
28kB
|
-> Seq Scan on mv_autosuggest (cost=0.00..180.29
rows=6619 width=51) (actual time=0.263..161.289 rows=6619
loops=1) |
Planning time: 0.139
ms
|
Execution time: 163.174
ms
|
How can I improve the performance here?
This can be improved if you use sort only by distances