Hello,

the first patch in the series fixes a possible leak of UTF-16 surrogates into the resulting UTF-8 string decoded from on-disk names. The rest of the series cleans up the unicode conversion functions somewhat and adds support for full encoding and decoding of UTF-16 characters outside of the Base Multilingual Plane (characters needing more than one UTF-16 code unit).

Review and testing are welcome.

Honza
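
For reviewers less familiar with surrogate pairs, below is a minimal standalone sketch of the conversion described above. It is not the code from the patches: the helper names utf16_decode() and utf8_encode() are hypothetical, and rejecting an unpaired surrogate with a zero return is an assumption made for illustration, not necessarily what the series does.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): decode one code point from a
 * buffer of UTF-16 code units. Returns the number of code units consumed
 * (1 or 2), or 0 if the buffer is empty or starts with an unpaired
 * surrogate.
 */
static size_t utf16_decode(const uint16_t *in, size_t len, uint32_t *cp)
{
	uint16_t u;

	if (len == 0)
		return 0;
	u = in[0];
	/* Ordinary BMP character: one code unit, used as is. */
	if (u < 0xD800 || u > 0xDFFF) {
		*cp = u;
		return 1;
	}
	/* A low surrogate may not come first. */
	if (u >= 0xDC00)
		return 0;
	/* A high surrogate must be followed by a low surrogate. */
	if (len < 2 || in[1] < 0xDC00 || in[1] > 0xDFFF)
		return 0;
	*cp = 0x10000 + (((uint32_t)(u - 0xD800) << 10) |
			 (uint32_t)(in[1] - 0xDC00));
	return 2;
}

/*
 * Hypothetical helper (illustration only): encode one code point as
 * UTF-8. Returns the number of bytes written (1 to 4), or 0 if the value
 * is a surrogate or lies above U+10FFFF, so that surrogates never reach
 * the output string.
 */
static size_t utf8_encode(uint32_t cp, uint8_t *out)
{
	if (cp < 0x80) {
		out[0] = (uint8_t)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (uint8_t)(0xC0 | (cp >> 6));
		out[1] = (uint8_t)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;
	if (cp < 0x10000) {
		out[0] = (uint8_t)(0xE0 | (cp >> 12));
		out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (uint8_t)(0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp <= 0x10FFFF) {
		out[0] = (uint8_t)(0xF0 | (cp >> 18));
		out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (uint8_t)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}
```

In this sketch the pair 0xD83D 0xDE00 decodes to U+1F600 and is written out as four UTF-8 bytes, while a high surrogate followed by an ordinary character is rejected instead of being copied into the output.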