Sounds like I may have touched a nerve with
some. If so, no offense intended!
There are cases where case-sensitivity is required
or desirable, it would be silly to argue otherwise.
Where you have such cases, then case-sensitive queries
are great. Some RDBMS systems default to
case-sensitive searches, others default to case-blind
searches. I'll take that to mean that either choice
has merit, and it's not foolish to prefer either. I
need case-blind searches vanishingly close to 100% of
the time, but other people have different conditions
and may find that they almost never need case-blind
searches. I find this extreme hard to imagine, but
find it easy to imagine people who need case-sensitive
searches quite often.
What I've been thinking about most are user-driven
searches on user-oriented data. Users, at least any
user I've ever had, don't want case-sensitive
searches. They also don't care about diacritical
characters, at least in English. I worked for a French
company for many years. Diacritical searches were not
always preferred, but sometimes were. It depends on
your user community and their requirements and norms.
Sometimes it comes down to an individual user. Options
are good! I was really just trying to warn someone
coming from a base-blind default about Postgres
behavior because, well, it hurts if you aren't
expecting it. That doesn't make Postgres wrong (it's
not that kind of a choice), but it is important to
know about.
I'm new to Postgres (only about a year in), and
it's great. But I'm used to a case-blind search as a
default. And, honestly, I can *never* remember a case
when a user asked for a case-sensitive search. Ever.
In 30+ years. Maybe it's just me. Just kidding, it's
not just me. If you're presenting users with a search
interface, you can find out by asking them. Or you can
AB test a search UI where there is the option of
case-sensitive/blind searching, but you randomly flip
which is the default. For users, case-sensitive
searches are assumed. That's what Google does.
Seriously, Google === Search. It's not a hard test to
run. If you find that with a case-blind search, 30% of
user tick the box to make it case-sensitive, then
you've got users that often do care about
case-sensitive search.
And since it seems to be unclear at a few places in
the discussion above: It absolutely makes sense to
store data in its original form and to allow for
case-sensitive searches, when required. It would be
very weird to store
Call me back Ishmael, I've gotta go...
and get back anything else, be that
call me back ishmael, i've gotta go...
or
CALL ME BACK ISHMAEL, I'VE GOTTA GO...
As far as I understand it in Postgres, you can:
* Use something like UPPER or LOWER in Every.
Single. Search.
* Fold text to one case consistently in an index to
make searches case-blind.
* Use citext.
* Teach and convince every user to enter and search
for data case-sensitively correctly Every. Single.
Time.
On that last point, good luck. Here's an example,
I'm keen on birds. Do you write it:
Black-shouldered Kite
Black-Shouldered Kite