>How can we boost performance of queries containing pattern matching >characters? In my case, we're using a percent sign (%) that matches any string of zero or more characters. > > QUERY: DELETE FROM MYTABLE WHERE EMAIL ILIKE '%domain.com%' > > EMAIL column is VARCHAR(256). > > As it is clear from the above query, email is matched "partially and case-insensitively", which my application requirement demands. > > In case, if it were a full match, I could easily define a functional > INDEX on EMAIL column (lower(EMAIL)) and I could rewrite my DELETE where criteria like lower(EMAIL) = 'someemail@xxxxxxxxxx'. > > MYTABLE currently contains 2 million records and grows consistently. I had almost the same problem. To resolve it, I created my own text search parser (myftscfg) which divides text in column into three letters parts, for example: someemail@xxxxxxxxxx is divided to som, ome,mee,eem,ema,mai,ail,il@, l@d,@do,dom,oma,mai,ain,in.,n.c,.co,com There should be also index on email column: CREATE INDEX "email _fts" on mytable using gin (to_tsvector('myftscfg'::regconfig, email)) Every query like email ilike '%domain.com%' should be rewrited to: WHERE to_tsvector('myftscfg',email) @@ to_tsquery('dom') AND to_tsvector('myftscfg',email) @@ to_tsquery('oma') AND to_tsvector('myftscfg',email) @@ to_tsquery('mai') AND to_tsvector('myftscfg',email) @@ to_tsquery('ain') AND to_tsvector('myftscfg',email) @@ to_tsquery('in.') AND to_tsvector('myftscfg',email) @@ to_tsquery('n.c') AND to_tsvector('myftscfg',email) @@ to_tsquery('.co') AND to_tsvector('myftscfg',email) @@ to_tsquery('com') AND email ILIKE '%domain.com%'; Index is reducing number of records and clause email ILIKE '%domain.com%' is selecting only valid records. I didn't found better solution. ------------------------------------------- Artur Zajac -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance