similarity and operator '%'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,
I'm trying to find persons in an address database where I have built trgm-indexes on name, street, zip and city.

When I search for all four parts of the address (name, street, zip and city)

    select name, street, zip, city
    from addresses
    where name % $1
      and street % $2
      and (zip % $3 or city % $4)

everything works fine: It takes less than a second to get some (5 - 500) proposed addresses out of 500,000 addresses and the query plan shows

    Bitmap Heap Scan on addresses  (cost=168.31..1993.38 rows=524 ...
      Recheck Cond: ...
      ->  Bitmap Index Scan on ...
            Index Cond: ...

The same happens when I search only by name with

    select name, street, zip, city
    from addresses
    where name % $1

But when I rewrite this query to

    select name, street, zip, city
    from addresses
    where similarity(name, $1) > 0.3

which means exactly then same as the second example, the query plan changes to

    Seq Scan on addresses  (cost=0.00..149714.42 rows=174675 width=60)
      Filter: ...

and the query lasts about a minute.

The reason for using the similarity function in place of the '%'-operator is that I want to use different similarity values in one query:

    select name, street, zip, city
    from addresses
    where name % $1
        and street % $2
        and (zip % $3 or city % $4)
        or similarity(name, $1) > 0.8

which means: take all addresses where name, street, zip and city have little similarity _plus_ all addresses where the name matches very good.


The only way I found, was to create a temporary table from the first query, change the similarity value with set_limit() and then select the second query UNION the temporary table.

Is there a more elegant and straight forward way to achieve this result?

regards  Volker

--
Volker Böhm        Tel.: +49 4141 981155
Voßkuhl 5          mailto:volker@xxxxxxxxx
21682 Stade        http://www.vboehm.de


--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux