On Tue, Oct 16, 2012 at 8:01 PM, Sam Wong <sam@xxxxxxxxxxxx> wrote: >> On Wednesday, October 17, 2012 4:30, Merlin Moncure wrote, >> >> On Tue, Oct 16, 2012 at 3:15 AM, Sam Wong <sam@xxxxxxxxxxxx> wrote: >> > Hi communities, >> > >> > I am investigating a performance issue involved with LIKE 'xxxx%' on >> > an index in a complex query with joins. >> > >> > The problem boils down into this simple scenario---: >> > ====Scenario==== >> > My database locale is C, using UTF-8 encoding. I tested this on 9.1.6 > and 9. >> > 2.1. >> > >> > Q1. >> > SELECT * FROM shipments WHERE shipment_id LIKE '12345678%' >> > >> > Q2. >> > SELECT * FROM shipments WHERE shipment_id >= '12345678' AND >> > shipment_id < '12345679' >> > >> > ...snip... >> > >> > ====Question==== >> > Is Q1 and Q2 equivalent? From what I see and the result they seems to >> > be the same, or did I miss something? (Charset: C, Encoding: UTF-8) If >> > they are equivalent, is that a bug of the planner? >> >> They are most certainly not equivalent. What if the shipping_id is >> 12345678Z? >> >> merlin >> > But '12345678Z' is indeed >= '12345678' AND < '12345679'. Just like 'apple' > < 'apples' < 'apply' in a dictionary. Right -- I didn't visualize it properly. Still, you're asking the server to infer that since you're looking between to adjacent textual characters range bounded [) it convert the 'between' to a partial string search. That hold up logically but probably isn't worth spending cycles to do, particularly in cases of non-ascii mappable unicode characters. merlin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance