Search Postgresql Archives

Re: Camel case identifiers and folding

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 3/18/19 5:18 AM, Morris de Oryx wrote:
Sounds like I may have touched a nerve with some. If so, no offense intended!

There are cases where case-sensitivity is required or desirable, it would be silly to argue otherwise. Where you have such cases, then case-sensitive queries are great. Some RDBMS systems default to case-sensitive searches, others default to case-blind searches. I'll take that to mean that either choice has merit, and it's not foolish to prefer either. I need case-blind searches vanishingly close to 100% of the time, but other people have different conditions and may find that they almost never need case-blind searches. I find this extreme hard to imagine, but find it easy to imagine people who need case-sensitive searches quite often.

What I've been thinking about most are user-driven searches on user-oriented data. Users, at least any user I've ever had, don't want case-sensitive searches. They also don't care about diacritical characters, at least in English. I worked for a French company for many years. Diacritical searches were not always preferred, but sometimes were. It depends on your user community and their requirements and norms. Sometimes it comes down to an individual user. Options are good! I was really just trying to warn someone coming from a base-blind default about Postgres behavior because, well, it hurts if you aren't expecting it. That doesn't make Postgres wrong (it's not that kind of a choice), but it is important to know about.

I'm new to Postgres (only about a year in), and it's great. But I'm used to a case-blind search as a default. And, honestly, I can *never* remember a case when a user asked for a case-sensitive search. Ever. In 30+ years. Maybe it's just me. Just kidding, it's not just me. If you're presenting users with a search interface, you can find out by asking them. Or you can AB test a search UI where there is the option of case-sensitive/blind searching, but you randomly flip which is the default. For users, case-sensitive searches are assumed. That's what Google does. Seriously, Google === Search. It's not a hard test to run. If you find that with a case-blind search, 30% of user tick the box to make it case-sensitive, then you've got users that often do care about case-sensitive search.

And since it seems to be unclear at a few places in the discussion above: It absolutely makes sense to store data in its original form and to allow for case-sensitive searches, when required. It would be very weird to store

     Call me back Ishmael, I've gotta go...

and get back anything else, be that

     call me back ishmael, i've gotta go...

or

     CALL ME BACK ISHMAEL, I'VE GOTTA GO...

As far as I understand it in Postgres, you can:

* Use something like UPPER or LOWER in Every. Single. Search.

* Fold text to one case consistently in an index to make searches case-blind.

* Use citext.

* Teach and convince every user to enter and search for data case-sensitively correctly Every. Single. Time.

On that last point, good luck. Here's an example, I'm keen on birds. Do you write it:

Black-shouldered Kite
Black-Shouldered Kite


Since you say you're new(ish) to postgres I have to ask have you bumped into the tilde(~) operator for text? And the critical tildeAstersik ( ~* )?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux