Re: pgsql 10.23 , different systems, same table , same plan, different Buffers: shared hit

Achilleas Mantzios <a.mantzios@xxxxxxxxxxxxxxxxxxxx> · Fri, 15 Sep 2023 23:36:48 +0300

On 15/9/23 22:42, Tom Lane wrote:
Achilleas Mantzios <a.mantzios@xxxxxxxxxxxxxxxxxxxx> writes:
Thank you, I see that both systems use en_US.UTF-8 as lc_collate and
lc_ctype,
Doesn't necessarily mean they interpret that the same way, though :-(

the below seems ok
FreeBSD :
postgres@[local]/dynacom=# select * from (values
('a'),('Z'),('_'),('.'),('0')) as qry order by column1::text;
column1
---------
_
.
0
a
Z
(5 rows)
Sadly, this proves very little about Linux's behavior.  glibc's idea
of en_US involves some very complicated multi-pass sort rules.
AFAICT from the FreeBSD sort(1) man page, FreeBSD defines en_US
as "same as C except case-insensitive", whereas I'm pretty sure
that underscores and other punctuation are nearly ignored in
glibc's interpretation; they'll only be taken into account if the
alphanumeric parts of the strings sort equal.

			regards, tom lane

Thank you so much. Makes perfect sense.

This raises the question, also asked on the -sql list: how do I index on
regexes, or at least get a somewhat scalable solution? Here I try to
match a given string against a stored regex, whereas in pg_trgm's case
the user tries to match a stored text against a given regex.
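
To make the regex question above concrete, here is a minimal sketch of the
query shape I mean. The table and column names (stored_patterns, pattern) and
the sample values are made up for illustration only; they are not from the
real schema.

```sql
-- Hypothetical table: one regular expression stored per row.
CREATE TABLE stored_patterns (
    id      serial PRIMARY KEY,
    pattern text NOT NULL
);

INSERT INTO stored_patterns (pattern)
VALUES ('^ab+c$'), ('^[0-9]+$');

-- The given string is the left operand of ~ and the stored regex is the
-- right operand, i.e. the reverse of the pg_trgm case (column ~ 'regex').
SELECT id, pattern
FROM stored_patterns
WHERE 'abbc' ~ pattern;
```

With these two sample rows only '^ab+c$' matches 'abbc'. As far as I can
tell, a query of this shape has to evaluate every stored pattern against the
given string, which is why I am asking about scalability.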

--
Achilleas Mantzios
 IT DEV - HEAD
 IT DEPT
 Dynacom Tankers Mgmt