Durumdara: On Thu, Jan 11, 2018 at 1:23 PM, Durumdara <durumdara@xxxxxxxxx> wrote: > Dear Francesco! FrancIsco, with an I, Spanish, not Italian. > My "bug" is that I commonly used Windows environment where the default > collation is ANSI, based on Windows language which is Hungarian here > (Windows1250). I'm not sure ANSI is a collation in windows. You should read a bit about locales, as they are not the same as collations. In fact I suspect the order issues you've got are not UTF8 derived, as all your data is ASCII(7bits), which represents the same in ANSI and UTF8 or ISO-8859-whatever ( IIRC win1250 is a bastard superset of ISO8859-1 ) > But because of special characters we used UTF8 to store data in database. > I supposed that UTF8.hu_HU is working like local natural order here, and the > common ASCII chars are (like '/') in same position. You must define "local natural order". But your problem is in the locale, not in the encoding. If you managed to use win1250.hu_HU your sorting will be the same, what you need is to use UTF8.C > Python/Delphi/LibreOffice can sort these numbers correctly (based on local > ANSI sort). ANSI does not define sort order. And those three use, IIRC, C-locale like sort. You are incorrectly assuming this is the correct one. > I supposed that UTF8.hu_HU is using mostly same order which is valid here > and it contains all ASCII + ANSI characters we are using here in daily work, > and they are in very similar order. > I never thought that it can't handle normal characters in 7 bit range... It can. Your assumptions are BAD. You are not used to working with collation-aware systems like postgres, and you assume they must use the same as non-locale-awaer programs by default. This is normally never true. Also, windows is notoriously dificult to configure for locales. IIRC ( haven't used it in 15 years ) you had to go to keyboard preferences to change it. > For these numbers I can use C collation, it's ok. C collation is like sorting raw bytes, it doesn't event sort upper/lower case correctly ( Do not know how you do it in HU, but in ES we sort aA before bB, while C locale normally sorts AB..ab.. It's what non-locale aware programs use because it's dirt cheap to implement, just sort the data as unsigned byte arrays lexicographically. And in fact, IIRC, utf-8 is dessigned in a way that this works too ( You can sort by raw bytes and you get the data sorted lexicographically sorted by code points ). Another thing, if you have a column which you want sorted in C locale, it can be defined as such so it does it by default ( but if you do it with an alter column or similar you'll need a reindex probably ). As a final note, please, DO NOT TOP POST and TRIM YOUR QUOTES. Specially I do not need my signature quoted. Francisco Olarte.