Without further ado, the following was found:

Issue: ISO 10646 → ISO/IEC 10646

"Unicode (ISO 10646) is a standard which aims to unambiguously represent "
"every character in every human language. Unicode's structure permits 20.1 "
"bits to encode every character. Since most computers don't include 20.1-bit "
"integers, Unicode is usually encoded as 32-bit integers internally and "
"either a series of 16-bit integers (UTF-16) (needing two 16-bit integers "
"only when encoding certain rare characters) or a series of 8-bit bytes "
"(UTF-8)."

"A byte 110xxxxx is the start of a 2-byte code, and 110xxxxx 10yyyyyy is "
"assembled into 00000xxx xxyyyyyy. A byte 1110xxxx is the start of a 3-byte "
"code, and 1110xxxx 10yyyyyy 10zzzzzz is assembled into xxxxyyyy yyzzzzzz. "
"(When UTF-8 is used to code the 31-bit ISO 10646 then this progression "
"continues up to 6-byte codes.)"
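
For reference, the bit assembly described in the second quoted string can be sketched as follows. This is only an illustration in Python and is not part of the quoted text; the function name decode_utf8_sequence is hypothetical. It covers the 1-, 2- and 3-byte forms only, and it does not reject overlong encodings or surrogate code points.

```python
def decode_utf8_sequence(data: bytes) -> int:
    """Assemble one code point from a 1-, 2-, or 3-byte UTF-8 sequence.

    Hypothetical helper for illustration; longer forms are not handled,
    and overlong encodings and surrogates are not rejected.
    """
    if not data:
        raise ValueError("empty input")

    first = data[0]
    if first < 0x80:                        # 0xxxxxxx: plain ASCII
        length, value = 1, first
    elif (first & 0xE0) == 0xC0:            # 110xxxxx: start of a 2-byte code
        length, value = 2, first & 0x1F
    elif (first & 0xF0) == 0xE0:            # 1110xxxx: start of a 3-byte code
        length, value = 3, first & 0x0F
    else:
        raise ValueError("lead byte not covered by this sketch")

    if len(data) != length:
        raise ValueError("wrong number of bytes for this lead byte")

    for byte in data[1:]:
        if (byte & 0xC0) != 0x80:           # continuation bytes are 10yyyyyy
            raise ValueError("invalid continuation byte")
        # Shift the bits collected so far and append the 6 payload bits.
        value = (value << 6) | (byte & 0x3F)
    return value
```

Called on the two bytes C3 A9 this yields 0xE9, and on the three bytes E2 82 AC it yields 0x20AC, which agrees with Python's built-in bytes.decode("utf-8") for the same input.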