Le 23/09/2011 12:30, Alban Hertroys a écrit : > On 23 September 2011 11:47, Andrew Rose <andrew.rose@xxxxxxxxxxxxxx > <mailto:andrew.rose@xxxxxxxxxxxxxx>> wrote: > > Basic Question: In text fields, is prefix matching significantly faster > than suffix matching? > > > It does depend on what type of index you use. BTrees split off text strings, > from left to right, halving the number of records you need to scan at every > branch. For a suffix match, that's exactly the wrong way around. > Hash indexes probably don't fare any better. > I don't know enough about GIST or GIN indexes to comment on their suitability > for suffix matches, but presumably they're better at those. > > I recall doing suffix matches used to be a problem in at least earlier versions > of Postgres, but it's quite possible that the query planner is smart enough to > do the reverse match by itself nowadays (I doubt it, seeing that it would also > need to reverse the way the index is organised). > > > 2. Alternatively, I could store column 'rev_str' as a reversed version of > column 'str' and have the client produce a reversed version of x on each > query (call it r). Then the client would issue... > > SELECT * FROM tbl WHERE str LIKE 'x%' OR rev_str LIKE 'r%' > > ...which would use prefix matches only instead of requiring suffix matches. > Since I've seen this form used by others, I was wondering if it's > necessary - i.e. if databases really do perform prefix matching faster? > > 3. Is there a solution I'm unaware of with even better performance? > > > You can create a functional index on the reverse of the string, that way > omitting the need for an extra column (that needs updating as well). > > CREATE INDEX tbl_str_rev ON tbl (reverse(str)); > SELECT * FROM tbl WHERE str LIKE 'x%' OR reverse(str) LIKE 'x%'; > > See: http://www.postgresql.org/docs/9.0/static/indexes-expressional.html You can use the pg_trgm extension which lets you create gin or gist indexes on your text field. There is also wildspeed (see http://www.sai.msu.su/~megera/wiki/wildspeed). Didn't try the latter solution, but the first one gives really great result for searching partial strings. a propos, there's one thing I'd like to know, is how to set the similarity limit within pg_trgm on a server side (I'd like to have it settled to 0.2 for every new session, for instance). Regards, -- Stéphane Schildknecht http://www.loxodata.com Contact régional PostgreSQL -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general