Alvaro Herrera <alvherre@xxxxxxxxxxxxxx> writes:
> On 2022-Mar-27, Ralf Schuchardt wrote:
>> linked here https://www.xoev.de/downloads-2316#StringLatin it is said,
>> that the spec is a strict subset of unicode (E.1.6), and it is also
>> mentioned in E.1.4, that in UTF-8 all unicode characters can be
>> encoded. Therefore UTF-8 can be used to encode all DIN SPEC 91379
>> characters.

> So the remaining question is whether DIN SPEC 91379 requires an
> implementation to support character U+0000. If it does, then PostgreSQL
> is not conformant, because that character is the only one in Unicode
> that we don't support. If U+0000 is not required, then PostgreSQL is
> okay.

Hmm ... UTF8 as defined in RFC3629/STD63 [1] does not allow "all unicode
characters to be encoded". It disallows surrogate pairs (U+D800--U+DFFF)
and code points above U+10FFFF. We follow that spec, so depending on
what DIN 91379 *actually* says, we might have additional reasons not
to be in compliance.

I don't read German unfortunately.

regards, tom lane

[1] http://www.faqs.org/rfcs/rfc3629.html
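
For anyone who wants to check a character list against the restrictions discussed above, here is a minimal sketch in Python. The helper names (`storable_in_pg_utf8`, `python_utf8_accepts`) are made up for illustration and are not part of PostgreSQL or of any spec. The first function simply encodes the three exclusions named in this thread (U+0000, the surrogate range U+D800--U+DFFF, and anything above U+10FFFF); the second asks Python's own UTF-8 encoder, which rejects surrogates and out-of-range values but does accept U+0000.

```python
def storable_in_pg_utf8(cp: int) -> bool:
    """Hypothetical check: can code point `cp` appear in a UTF8 text value,
    given the restrictions mentioned in this thread?"""
    if cp == 0x0000:
        # U+0000 is the one Unicode character the thread says is unsupported.
        return False
    if 0xD800 <= cp <= 0xDFFF:
        # Surrogate range, disallowed by RFC 3629.
        return False
    # RFC 3629 caps UTF-8 at U+10FFFF; negative values are not code points.
    return 0 < cp <= 0x10FFFF


def python_utf8_accepts(cp: int) -> bool:
    """Ask Python's UTF-8 encoder whether it will encode `cp`."""
    try:
        chr(cp).encode("utf-8")
        return True
    except (ValueError, UnicodeEncodeError):
        # ValueError: outside 0..0x10FFFF; UnicodeEncodeError: surrogate.
        return False


if __name__ == "__main__":
    for cp in (0x0000, 0x0041, 0xD800, 0xDFFF, 0x10FFFF, 0x110000):
        print(f"U+{cp:04X}", storable_in_pg_utf8(cp), python_utf8_accepts(cp))
```

Running every code point listed in DIN SPEC 91379 through `storable_in_pg_utf8` would show whether any of them falls into an excluded range; the list itself would have to come from the spec, which is not reproduced here.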