Hi Roland,

Roland Mainz wrote on Wed, Jan 20, 2016 at 01:13:10AM +0100:

> Some generic portability comments:
> 1. There are other modern encodings like GB18030

Yes, but there are no plans to support any encodings other than
UTF-8 in the OpenBSD base system, so supporting other encodings
would be a matter for the portable version, if at all.

I will consider whether it is possible to write multibyte character
support in a way that doesn't result in obfuscation (and hence loss
of security) on OpenBSD and yet supports other encodings elsewhere,
but i'm not yet sure that will be possible.  In case of the slightest
doubt, i expect OpenSSH developers will prioritize security over
additional encoding support.

> (support is even mandatory for software sold to the government in
> PRC China)

I'm not aware of any plans to sell OpenSSH to the government of
China, but they are of course welcome to use it for free.

> 2. |wcwidth()| counts in terminal cells and not number of characters
> (where one character might occupy one or more bytes), e.g. there are
> characters which may occupy from zero to four terminal cells (actual
> number of cells is a bit (not much) OS specific).

I never heard about any characters occupying more than three cells.
As far as i know, the result of wcwidth(3) is not specified by the
Unicode standard, so i'm usually looking at the Perl implementation
as a reference.  Last time i looked there, i didn't find any actual
characters occupying more than two cells, even though characters
of width three might in principle be possible.

> 3. I am not sure whether there is a specific byte limit for UTF-8
> in any of the standards,

Yes, current Unicode limits codepoints to U+0000 to U+10FFFF, which
limits UTF-8 to one to four bytes.  But five and six byte UTF-8
sequences were considered in the past, so you are right that we
should make sure that nothing breaks if some system has bogus
support for those.

> e.g.
> "- To support terminals larger than MAX_WINSIZE and still be
> properly indented I increased the buf size to 4x the size
> of MAX_WINSIZE, since the maximum size of an UTF-8 char <should>
> be 4 bytes." might not be a portable assumption and I would
> at least safeguard it.

Yes, thank you for your comments, i have taken notes in my TODO
file to make sure that they will not be forgotten when reviewing
future patches.  In particular the last one is quite important:

 * scp(1) comments by Roland Mainz:
   try to make things work even with non-UTF-8 outside OpenBSD, if easy
   make sure nothing breaks for wcwidth(...) > 2
   make sure nothing breaks for MB_CUR_MAX > 4

Yours,
  Ingo
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@xxxxxxxxxxx
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
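
For illustration only, the two "make sure nothing breaks" items in the
TODO list above could be safeguarded roughly as follows.  This is a
minimal sketch and not code from OpenSSH or from the patch under
review: the helper name fit_columns() is hypothetical, and counting
invalid bytes and nonprintable characters as one column each is an
assumption made for the example.  The point is that the loop takes
the byte length from mbrtowc(3) and the column count from wcwidth(3)
as reported, without assuming at most four bytes or two columns per
character.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* some libcs hide wcwidth() otherwise */
#endif
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Redundant but compatible prototype, in case <wchar.h> was already
 * included without the feature-test macro above. */
int wcwidth(wchar_t);

/*
 * Hypothetical helper: return how many leading bytes of s fit into
 * maxcols terminal columns in the current locale, and store the
 * number of columns actually used in *cols.
 */
static size_t
fit_columns(const char *s, size_t maxcols, size_t *cols)
{
	mbstate_t st;
	size_t len = strlen(s), off = 0, used = 0;

	memset(&st, 0, sizeof(st));
	while (off < len) {
		wchar_t wc;
		size_t n = mbrtowc(&wc, s + off, len - off, &st);
		int w;

		if (n == 0)
			break;		/* embedded NUL: stop here */
		if (n == (size_t)-1 || n == (size_t)-2) {
			/* Invalid or truncated sequence (assumption:
			 * show the byte as one placeholder column). */
			memset(&st, 0, sizeof(st));
			n = 1;
			w = 1;
		} else {
			w = wcwidth(wc);
			if (w < 0)
				w = 1;	/* nonprintable: one column */
		}
		/* Works for any width, including w > 2. */
		if ((size_t)w > maxcols - used)
			break;
		used += (size_t)w;
		off += n;	/* works for any n, including n > 4 */
	}
	*cols = used;
	return off;
}
```

A caller would copy only the returned number of bytes into its output
buffer, checking that count against the buffer size, rather than
sizing the buffer as a fixed multiple of the window width.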