Re: Question regarding UTF-8 data and "C" collation on definition of field of table

Dionisis Kontominas <dkontominas@xxxxxxxxx> · Mon, 6 Feb 2023 02:07:01 +0100

Because if I don't specify the collation/LC_CTYPE, it seems to take the default from the OS, which in my case is English_Netherlands.1252 (database encoding UTF8). That might not be the best choice for columns holding truly Unicode content, so I investigated the "C" option, which also does not seem to work and might even be worse.
To reframe my question: when you expect multilingual data in a column and the database encoding is UTF8, which seems to cover the storage need, what could be considered best practice (if one really exists) for collation and LC_CTYPE?
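
To make the concern about "C" concrete, here is a minimal sketch of the ordering it produces, assuming (as the PostgreSQL documentation describes) that the "C" collation compares strings byte by byte. It only emulates that comparison in Python and does not talk to a database; the function name c_collation_sort and the sample names are hypothetical, chosen for illustration.

```python
# Sketch: emulate a byte-wise ("C"-style) comparison on UTF-8 strings.
# Assumption: "C" collation orders strings by their raw encoded bytes.
# The helper name and the sample data are hypothetical.

def c_collation_sort(values):
    """Return the strings sorted by their UTF-8 byte sequences."""
    return sorted(values, key=lambda s: s.encode("utf-8"))

names = ["apple", "Zebra", "éclair", "Ωmega", "banana"]
ordered = c_collation_sort(names)

# Uppercase ASCII sorts before lowercase ASCII, and every non-ASCII
# character sorts after all ASCII characters, regardless of language.
print(ordered)  # ['Zebra', 'apple', 'banana', 'éclair', 'Ωmega']
```

The result is stable and independent of the OS locale, but it is not the order a reader of any particular language would expect, which is the trade-off behind the question above.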

On Mon, 6 Feb 2023 at 01:57, Ron <ronljohnsonjr@xxxxxxxxx> wrote:
Why are you specifying the collation to be "C" when the default db encoding is UTF8, and UTF-8 can encode Greek, Chinese and English text?

On 2/5/23 17:08, Dionisis Kontominas wrote:

> Hello all,
>
>   I have a question regarding the definition of the type of a character
> field in a table, and more specifically about its collation and UTF-8
> characters and strings.
>
>   Let's say that the definition is, for example, as follows:
>
>     name character varying(8) COLLATE pg_catalog."C" NOT NULL
>
> and also assume that the database default encoding is UTF8 and that the
> Collate and Ctype are both "C". I plan to store strings of various
> languages in this field.
>
> Are these the correct settings that I should have used on creation of
> the database?
>
> Thank you in advance!
>
> Kindest regards,
>
> Dionisis Kontominas

-- 
Born in Arizona, moved to Babylonia.