On Thu, Feb 21, 2013 at 6:06 AM, Ivan Voras <ivoras@xxxxxxxxxxx> wrote: > On 21/02/2013 12:52, Ivan Voras wrote: > >> I'd like to use pg_trgm for matching substrings case-insensitively, but >> it doesn't seem to use the index: > > > As a sort-of followup, the '%' operator kind of works but takes > incredibly long time, and the selectivity estimates are completely wrong: > > nn=> vacuum analyze documents; > VACUUM > nn=> explain select id,title from documents where raw_data % 'zagreb'; > QUERY PLAN > > ----------------------------------------------------------------------------------------- > Bitmap Heap Scan on documents (cost=128.42..330.87 rows=54 width=108) > Recheck Cond: ((raw_data)::text % 'zagreb'::text) > -> Bitmap Index Scan on documents_raw_data_trgm (cost=0.00..128.40 > rows=54 width=0) > Index Cond: ((raw_data)::text % 'zagreb'::text) > (4 rows) > > nn=> explain analyze select id,title from documents where raw_data % > 'zagreb'; > QUERY > PLAN > ----------------------------------------------------------------------------------------------------------------------------------------- > Bitmap Heap Scan on documents (cost=128.42..330.87 rows=54 width=108) > (actual time=98750.283..98750.283 rows=0 loops=1) > Recheck Cond: ((raw_data)::text % 'zagreb'::text) > -> Bitmap Index Scan on documents_raw_data_trgm (cost=0.00..128.40 > rows=54 width=0) (actual time=26.748..26.748 rows=51874 loops=1) > Index Cond: ((raw_data)::text % 'zagreb'::text) > Total runtime: 98750.623 ms > (5 rows) > > > There is no IO load during this query. pg_trgm is not really designed for indexing large documents, but for fuzzy simple string (company name, address, etc) matching. probably better off with full text search. merlin -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general