Slow performance of collate "en_US.utf8"

Hi!

Thank you, everyone, for your answers!

It is now clear that this is not a PG issue and that it will not be fixed anytime soon.

I see that even for strings consisting only of digits, sorting with en_US.utf8 is still well behind:

explain (analyze, costs, buffers, verbose)
select gen.id::text collate "C"
from generate_series(10000, 20000) AS gen(id)
order by 1 desc;
-- 3.5 ms

explain (analyze, costs, buffers, verbose)
select gen.id::text collate "en_US.utf8"
from generate_series(10000, 20000) AS gen(id)
order by 1 desc;
-- 19.8 ms
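
A possible follow-up measurement, to see whether the gap is specific to the libc-provided locale. This is only a sketch: it assumes the server was built with ICU support and that the predefined collation "en-x-icu" exists, neither of which is stated above. The first query checks which of the three collations are actually available.

```sql
-- Check which collations exist on this server and who provides them
-- (collprovider: c = libc, i = ICU, d = database default).
select collname, collprovider
from pg_collation
where collname in ('C', 'en_US.utf8', 'en-x-icu');

-- Same sort as above, but with an ICU collation.
-- "en-x-icu" is an assumption; substitute a collation listed by the query above.
explain (analyze, costs, buffers, verbose)
select gen.id::text collate "en-x-icu"
from generate_series(10000, 20000) AS gen(id)
order by 1 desc;
```

No timing is given for this variant, since it was not part of the measurements above.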

On the other hand, when I add limit 1, the difference becomes much smaller, for reasons I do not understand:

explain (analyze, costs, buffers, verbose)
select gen.id::text collate "C"
from generate_series(10000, 20000) AS gen(id)
order by 1 desc
limit 1;
-- 1.82 ms

explain (analyze, costs, buffers, verbose)
select gen.id::text collate "en_US.utf8"
from generate_series(10000, 20000) AS gen(id)
order by 1 desc
limit 1;
-- 2.8 ms
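
A possible explanation, not verified here: with a LIMIT, PostgreSQL can use a bounded sort ("top-N heapsort") that only keeps the best N rows. For limit 1 that is roughly one comparison per input row, whereas a full sort needs on the order of n log n comparisons. With far fewer comparisons, the per-comparison cost of the collation would matter much less. The sketch below reuses the queries above; the thing to look at is the "Sort Method" line that explain (analyze) prints for the Sort node.

```sql
-- Full sort: the Sort node is expected (assumption to verify) to report
-- "Sort Method: quicksort".
explain (analyze, costs off)
select gen.id::text collate "en_US.utf8"
from generate_series(10000, 20000) AS gen(id)
order by 1 desc;

-- Bounded sort: the Sort node is expected (assumption to verify) to report
-- "Sort Method: top-N heapsort".
explain (analyze, costs off)
select gen.id::text collate "en_US.utf8"
from generate_series(10000, 20000) AS gen(id)
order by 1 desc
limit 1;
```

If both variants report the same sort method, this explanation does not apply and the cause lies elsewhere.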

In fact, I have no database issues right now. I am just benchmarking the search speed of b-tree indexes on different column types (int4, int8, numeric, text, uuid) and ran into this corner case.

I hope to give a talk about this at one of the PG conferences some day.

