Re: Performance degradation in Index searches with special characters

Joe Conway <mail@xxxxxxxxxxxxx> · Mon, 7 Oct 2024 12:48:02 -0400

On 10/6/24 14:13, Tom Lane wrote:
> Joe Conway <mail@xxxxxxxxxxxxx> writes:
>> This is not surprising. There is a performance regression that started
>> in glibc 2.21 with regard to sorting unicode. Test with RHEL 7.x (glibc
>> 2.17) and I bet you will see comparable results to ICU. The best answer
>> in the long term, IMHO, is likely to use the new built-in collation just
>> released in Postgres 17.
>
> It seems unrelated to unicode though --- I also reproduced the issue
> in a database with LATIN1 encoding.
>
> Whatever, it is pretty awful, but the place to be complaining to
> is the glibc maintainers.  Not much we can do about it.

Yeah, my reply was imprecise.

The regression is in strcoll() in general, not only when sorting Unicode.
Specifically, it traces to this commit, which purports to improve
performance but demonstrably causes massive regressions:

https://sourceware.org/git/?p=glibc.git;a=commit;h=0742aef6
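
For anyone who wants to check their own glibc, here is a rough sketch of a
strcoll() micro-benchmark outside of Postgres. It is only an illustration:
the helper names (make_keys, time_sort), the key alphabet, and the choice of
"en_US.UTF-8" as the comparison locale are my assumptions, not something
taken from the original report, and the absolute numbers depend heavily on
the platform.

```python
import locale
import random
import timeit
from functools import cmp_to_key


def make_keys(n=2000, length=20, seed=0):
    """Build pseudo-random strings mixing letters with punctuation.

    The alphabet is an assumption chosen to include "special" characters.
    """
    rng = random.Random(seed)
    alphabet = "abcdefghijklmnopqrstuvwxyz-_/.:@#"
    return ["".join(rng.choice(alphabet) for _ in range(length))
            for _ in range(n)]


def time_sort(keys, locale_name, repeat=3):
    """Return the best-of-`repeat` time (seconds) to sort `keys` using
    strcoll() under `locale_name`, or None if the locale is unavailable."""
    saved = locale.setlocale(locale.LC_COLLATE)  # query current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        key = cmp_to_key(locale.strcoll)
        return min(timeit.repeat(lambda: sorted(keys, key=key),
                                 number=1, repeat=repeat))
    finally:
        # Always restore the collation locale we started with.
        locale.setlocale(locale.LC_COLLATE, saved)


if __name__ == "__main__":
    keys = make_keys()
    for name in ("C", "en_US.UTF-8"):
        elapsed = time_sort(keys, name)
        if elapsed is None:
            print(f"{name}: locale not installed, skipped")
        else:
            print(f"{name}: {elapsed:.4f} s")
```

Running the same script on a host with an older glibc and on one with a
newer glibc, then comparing the ratio between the "C" and "en_US.UTF-8"
timings, should give a rough idea of how much of the slowdown comes from
strcoll() itself rather than from Postgres.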

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com