Search Postgresql Archives

Re: Similarity search for sentences

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Janek Sendrowski <janek12@xxxxxx> wrote:

> I didn't know that the pg_trgm Module provides KNN search

It does, although my own experience shows that it tends to be more
appropriate for name searches or similar smaller columns than for
big text columns.  Using the war_and_peace table from another
thread:

test=# CREATE INDEX war_and_peace_linetext_trgm2 ON war_and_peace
using gist (linetext gist_trgm_ops);
CREATE INDEX
test=# VACUUM ANALYZE war_and_peace;
VACUUM
test=# -- Use a KNN search for the words.
EXPLAIN ANALYZE
SELECT * FROM war_and_peace
  ORDER BY 'ladies gentlemen provinces distance' <-> linetext
  LIMIT 10;
                                                                          QUERY PLAN                                                                          
---------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.28..2.74 rows=10 width=116) (actual time=42.157..42.246 rows=10 loops=1)
   ->  Index Scan using war_and_peace_linetext_trgm2 on war_and_peace  (cost=0.28..16016.42 rows=65007 width=116) (actual time=42.155..42.243 rows=10 loops=1)
         Order By: (linetext <-> 'ladies gentlemen provinces distance'::text)
 Total runtime: 42.716 ms
(4 rows)

As you can see, it pulls the requested number of rows from the
index in the order of "distance" (inverse similarity).  It is,
however, not as fast as a tsearch2 search (43 ms instead of a
fraction of a ms), and it doesn't give you options to use AND/OR as
you might like.  It is still a lot faster than scanning the whole
table and applying the test to each row.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general





[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux