On Sat, Jul 27, 2013 at 10:34 PM, Janek Sendrowski <janek12@xxxxxx> wrote:
--
Hi Sergey Konoplev,
Suppose I'm searching for a sentence like "The tiger is the largest cat species".
I can only find sentences that include all of the words "tiger, largest, cat, species", but I would also like to get sentences that contain only three or even two of these words.
Janek
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
Hi,
You can use the similarity functions of the pg_trgm extension.
Example:
=# \d+ test
                        Table "public.test"
 Column | Type | Modifiers | Storage  | Stats target | Description
--------+------+-----------+----------+--------------+-------------
 col    | text |           | extended |              |
Indexes:
    "test_idx" gin (col gin_trgm_ops)
Has OIDs: no
=# SELECT * FROM test;
                   col
-----------------------------------------
 The tiger is the largest cat species
 The cheetah is the fastest cat species
 The peacock is the largest bird species
(3 rows)
=# SELECT show_limit();
show_limit
------------
0.3
(1 row)
=# SELECT col, similarity(col, 'The tiger is the largest cat species') AS sml
FROM test WHERE col % 'The tiger is the largest cat species'
ORDER BY sml DESC, col;
                   col                   |   sml
-----------------------------------------+----------
 The tiger is the largest cat species    |        1
 The peacock is the largest bird species | 0.511111
 The cheetah is the fastest cat species  | 0.466667
(3 rows)
=# SELECT set_limit(0.5);
set_limit
-----------
0.5
(1 row)
=# SELECT col, similarity(col, 'The tiger is the largest cat species') AS sml
FROM test WHERE col % 'The tiger is the largest cat species'
ORDER BY sml DESC, col;
                   col                   |   sml
-----------------------------------------+----------
 The tiger is the largest cat species    |        1
 The peacock is the largest bird species | 0.511111
(2 rows)
=# SELECT set_limit(0.9);
set_limit
-----------
0.9
(1 row)
=# SELECT col, similarity(col, 'The tiger is the largest cat species') AS sml
FROM test WHERE col % 'The tiger is the largest cat species'
ORDER BY sml DESC, col;
                 col                  | sml
--------------------------------------+-----
 The tiger is the largest cat species |   1
(1 row)
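The higher you set the limit, the closer a row must be to the search string to pass the % filter, so only near-exact matches remain. A lower limit also lets through sentences that share just a few of the words.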
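
To see where the sml values and the effect of the limit come from, here is a rough Python sketch of trigram similarity. It is not pg_trgm's actual code, only an approximation under these assumptions: text is lowercased, every run of ASCII letters or digits counts as a word, each word is padded with two spaces in front and one behind, and the score is the number of shared trigrams divided by the number of distinct trigrams in both strings. The names trigrams, similarity and matches are made up for this illustration, and treating the limit as an inclusive bound is also an assumption.

```python
import re


def trigrams(text):
    """Return the set of trigrams of text, approximating pg_trgm's rules."""
    grams = set()
    # Lowercase, then treat each run of ASCII letters/digits as one word
    # (assumption: the real extension is locale-aware).
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "  # two spaces before, one after
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def matches(rows, query, limit):
    """Rows scoring at least `limit`, best first (hypothetical analogue of %)."""
    scored = [(row, similarity(row, query)) for row in rows]
    kept = [(row, score) for row, score in scored if score >= limit]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


rows = [
    "The tiger is the largest cat species",
    "The cheetah is the fastest cat species",
    "The peacock is the largest bird species",
]
query = "The tiger is the largest cat species"

for limit in (0.3, 0.5, 0.9):
    kept = matches(rows, query, limit)
    print(limit, [(row, round(score, 6)) for row, score in kept])
```

Under these assumptions the sketch gives 1, 0.511111 and 0.466667 for the three rows, and keeps three, two and one of them at limits 0.3, 0.5 and 0.9, in line with the psql output above. The peacock sentence outranks the cheetah one because "largest" contributes more shared trigrams than "cat" does.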
Beena Emerson