Re: How to boost performance of queries containing pattern matching characters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 14/02/11 07:46, Gnanakumar wrote:
If you really need to match all those options, you can't use an index. A
substring-matching index would need to have multiple entries per
character per value (since it doesn't know what you will search for).
The index-size becomes unmanageable very quickly.

That's why I asked what you really wanted to match.
To be more specific, in fact, our current application allows to delete
email(s) with a minimum of 3 characters.  There is a note/warning also given
for application Users' before deleting, explaining the implication of this
delete action (partial&  case-insensitive, and it could be wide-ranging
too).

So, I'll ask again: do you really want to match all of those options?
Yes, as explained above, I want to match all those.

Then you can't use a simple index. If you did use an index it would probably be much slower for "com" or "yah" or "gma" and so on.

The closest you can do is something like Artur's option (or the pg_trgm module - handy since you are looking at 3-chars and up) to select likely matches combined with a separate search on '%domain.com%' to confirm that fact.

P.S. - I'd be inclined to just match the central domain parts, so for "user1@xxxxxxxxxxxxxxxxxxx" you would index "europe" and "megacorp" and only allow matching on the start of each string. Of course if your application spec says you need to match on "p.c" too then that's what you have to do.

--
  Richard Huxton
  Archonet Ltd

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux