On 14/02/11 07:46, Gnanakumar wrote:
If you really need to match all those options, you can't use an index. A
substring-matching index would need to have multiple entries per
character per value (since it doesn't know what you will search for).
The index-size becomes unmanageable very quickly.
That's why I asked what you really wanted to match.
To be more specific, in fact, our current application allows to delete
email(s) with a minimum of 3 characters. There is a note/warning also given
for application Users' before deleting, explaining the implication of this
delete action (partial& case-insensitive, and it could be wide-ranging
too).
So, I'll ask again: do you really want to match all of those options?
Yes, as explained above, I want to match all those.
Then you can't use a simple index. If you did use an index it would
probably be much slower for "com" or "yah" or "gma" and so on.
The closest you can do is something like Artur's option (or the pg_trgm
module - handy since you are looking at 3-chars and up) to select likely
matches combined with a separate search on '%domain.com%' to confirm
that fact.
P.S. - I'd be inclined to just match the central domain parts, so for
"user1@xxxxxxxxxxxxxxxxxxx" you would index "europe" and "megacorp" and
only allow matching on the start of each string. Of course if your
application spec says you need to match on "p.c" too then that's what
you have to do.
--
Richard Huxton
Archonet Ltd
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance