On 10/6/24 13:28, Andrey Stikheev wrote:
Thanks for your feedback. After looking into it further, it seems the performance issue is indeed related to the default collation settings, particularly when handling certain special characters like "<" in the glibc strcoll_l function. This was confirmed during my testing on Debian 12 with glibc version 2.36 (this OS and glibc are being used in our office's Docker image: https://hub.docker.com/_/postgres).
This is not surprising. There is a performance regression that started in glibc 2.21 with regard to sorting Unicode. Test with RHEL 7.x (glibc 2.17) and I bet you will see results comparable to ICU. The best answer in the long term, IMHO, is likely to use the new built-in collation just released in Postgres 17.
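
If you want a quick sanity check of the libc collation cost outside of Postgres, something along these lines might do. It is only a rough sketch, not what was run in this thread: it assumes Python 3 and that the en_US.UTF-8 locale is installed, and the helper names (make_strings, timed_sort, use_locale) are made up for illustration. Note that Python's locale.strcoll goes through the C library's collation routines rather than calling strcoll_l directly, so treat it as a proxy only.

```python
import functools
import locale
import random
import time


def make_strings(n, length=64, alphabet="<abc", seed=0):
    """Build n pseudo-random strings that contain plenty of '<' characters."""
    rng = random.Random(seed)
    return ["".join(rng.choice(alphabet) for _ in range(length)) for _ in range(n)]


def timed_sort(strings, cmp=None):
    """Sort strings and return (elapsed_seconds, sorted_list).

    With cmp=None the sort uses plain code point order; otherwise the
    given two-argument comparison function (e.g. locale.strcoll) is used.
    """
    key = functools.cmp_to_key(cmp) if cmp is not None else None
    start = time.perf_counter()
    result = sorted(strings, key=key)
    return time.perf_counter() - start, result


def use_locale(name):
    """Try to switch LC_COLLATE to the named locale; return True on success."""
    try:
        locale.setlocale(locale.LC_COLLATE, name)
        return True
    except locale.Error:
        return False


if __name__ == "__main__":
    data = make_strings(20000)

    t_plain, _ = timed_sort(data)
    print(f"code point order: {t_plain:.3f}s")

    # Assumption: en_US.UTF-8 is available on this machine.
    if use_locale("en_US.UTF-8"):
        t_coll, _ = timed_sort(data, cmp=locale.strcoll)
        print(f"libc collation:   {t_coll:.3f}s")
    else:
        print("en_US.UTF-8 is not installed; skipping libc collation run")
```

Running the same script on hosts with different glibc versions should give a feel for how much of the difference comes from the C library. Part of the gap between the two runs is Python's own comparison-callback overhead, so compare the libc numbers across hosts rather than the two lines against each other.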
-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com