Re: Is it possible to sort strings in EBCDIC order in PostgreSQL server?

Peter Geoghegan <pg@xxxxxxx> · Tue, 12 Dec 2017 10:21:07 -0800

On Tue, Dec 12, 2017 at 5:18 AM, John McKown
<john.archie.mckown@xxxxxxxxx> wrote:
> On Tue, Dec 12, 2017 at 2:17 AM, Tsunakawa, Takayuki
> <tsunakawa.takay@xxxxxxxxxxxxxx> wrote:
>>
>> Hi Laurenz, Tom, Peter,
>>
>> Thanks for your suggestions.  The practical solution seems to be to
>> override comparison operators of char, varchar and text data types with UDFs
>> that behave as Tom mentioned.
>>
>> From: Peter Geoghegan [mailto:pg@xxxxxxx]
>> > That said, the idea of an "EBCDIC collation" seems limiting. Why
>> > should a system like DB2 for the mainframe (that happens to use EBCDIC
>> > as its encoding) not have a more natural, human-orientated collation
>> > even while using EBCDIC? ISTM that the point of using the "C" locale
>> > (with EBCDIC or with UTF-8 or with any other encoding) is to get a
>> > performance benefit where the actual collation's behavior doesn't
>> > matter much to users. Are you sure it's really important to be
>> > *exactly* compatible with EBCDIC order? As long as you're paying for a
>> > custom collation, why not just use a collation that is helpful to
>> > humans?
>>
>> You are right.  I'd like to ask the customer whether and why they need
>> EBCDIC ordering.
>
>
> This is a guess on my part, based on many years on an EBCDIC system. But
> I'll bet that they are doing a conversion off of the EBCDIC system (maybe
> Db2 on z/OS) to an ASCII system (Linux or Windows) running PostgreSQL. They
> want to be able to compare the output from the existing system to the output
> on the new system. EBCDIC orders "lower case", "upper case", then "digits".

ICU supports creating custom collations that change whether upper or
lower case sorts first, or where digits sort relative to scripts (e.g.
Latin alphabet characters). See the PostgreSQL documentation, section
"23.2.2.3.2. ICU collations". Advanced customization is possible.
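
To see the ordering John describes (lower case, then upper case, then
digits) outside the database, here is a minimal Python sketch. It
assumes the source system uses EBCDIC code page 037 (US/Canada); that
is a guess, and the real code page may differ. The helper name
ebcdic_key is made up for illustration. It sorts by the encoded bytes,
which mimics a byte-wise "C" locale comparison on an EBCDIC system.

```python
# Sketch only: assumes code page 037; swap in the real one if known.
EBCDIC_CODEC = "cp037"


def ebcdic_key(s: str) -> bytes:
    """Return the EBCDIC bytes of s, for use as a sort key.

    Raises UnicodeEncodeError if s contains a character that the
    chosen code page cannot represent.
    """
    return s.encode(EBCDIC_CODEC)


values = ["B2", "a1", "10", "Zz", "b"]

# Byte-wise ASCII/UTF-8 order: digits, upper case, lower case.
print(sorted(values))                  # ['10', 'B2', 'Zz', 'a1', 'b']

# Byte-wise EBCDIC order: lower case, upper case, digits.
print(sorted(values, key=ebcdic_key))  # ['a1', 'b', 'B2', 'Zz', '10']
```

This only helps when comparing exported output from the two systems;
it does not change how PostgreSQL itself sorts.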

-- 
Peter Geoghegan