Re: another seemingly simple encoding question
This doesn't sound like your problem, but I'll explain the
normalization issue using Korean as an example, since that seems to be
your data: Unicode has codepoints both for precomposed Hangul
syllables and for the individual Jamo, so a Hangul syllable can be
represented either with its single corresponding codepoint, or as a
sequence of two or three Jamo codepoints. A Unicode-aware font should
display these two alternatives identically. In any Unicode encoding,
including UTF-8, the two strings are not byte-for-byte identical. The
four Unicode normalization forms (NFC, NFD, NFKC and NFKD) are
algorithms for normalizing such strings so that they do compare
identically.
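
To make that concrete, here is a minimal sketch in Python using the
standard unicodedata module. The syllable chosen (U+D55C) and the
variable names are only an illustration of mine, not data from this
thread, and the normalization happens on the client side here rather
than inside the database.

```python
import unicodedata

# One precomposed Hangul syllable codepoint (U+D55C).
precomposed = "\ud55c"
# The same syllable spelled as three conjoining Jamo codepoints.
decomposed = "\u1112\u1161\u11ab"

# The raw strings differ, and so do their UTF-8 encodings.
print(precomposed == decomposed)                 # False
print(len(precomposed.encode("utf-8")))          # 3 bytes
print(len(decomposed.encode("utf-8")))           # 9 bytes

# After normalizing both sides to the same form, they compare equal.
same_after_nfc = (
    unicodedata.normalize("NFC", precomposed)
    == unicodedata.normalize("NFC", decomposed)
)
same_after_nfd = (
    unicodedata.normalize("NFD", precomposed)
    == unicodedata.normalize("NFD", decomposed)
)
print(same_after_nfc, same_after_nfd)            # True True
```

Which form you pick matters less than applying the same one to every
string before you store or compare it.
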
Anyway, it sounds like you have the opposite problem, two strings that
are comparing equal when you think they shouldn't. I don't know that
anyone can help you unless you post an actual example of two such
strings.
- John D. Burger
MITRE