Hi,

Demi Marie Obenour wrote on Thu, Apr 28, 2022 at 08:29:24PM -0400:
> On 4/27/22 05:40, Ingo Schwarze wrote:
>> Demi Marie Obenour wrote on Tue, Apr 26, 2022 at 09:12:07PM -0400:
>>> On 4/25/22 08:23, Ingo Schwarze wrote:

>> In OpenBSD, we are used to the deliberate
>> decision that the C library ignores all aspects of the locale except
>> the character encoding, [...]

> Off-topic: Why did OpenBSD make this decision?  In particular,
> LC_MESSAGES seems to be essential to internationalization support,
> without being very problematic otherwise.

I think having libc and POSIX utility programs always reliably print
diagnostics in the same way, and always in US-ASCII rather than
sometimes in UTF-8, is more valuable than internationalization of
operating system diagnostics, both from the user perspective
(predictability and comprehensibility) and from the OS maintainer
perspective (code simplicity and hence a better chance of correctness
and reliability).  Even as a native German speaker, i regularly get
confused when seeing German error messages because they usually feel
quite incomprehensible.

Besides, LC_CTYPE is essential for important functionality, but
picking individual features from all the rest of LC_* for
implementation isn't going to help.  It will increase code complexity
without really achieving internationalization (even full LC_* support
is not really sufficient for complete internationalization...).
So better ditch it outright than attempt some piecemeal approach.

Besides, even LC_MESSAGES has features that are prone to causing
trouble, for example changing the meaning of "yes" and "no".

> Also, is it safe if the server uses the C locale (LC_ALL=C) and the
> client uses UTF-8?

Yes, because US-ASCII is a subset of UTF-8, so what a well-behaved
server sends in the C locale is supposed to be a subset of what it
might send in a UTF-8 locale.

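To illustrate the subset argument, here is a minimal sketch in Python
(chosen only for brevity).  The function names and both sample
messages are made up for illustration and are not taken from any real
program.  It checks that output restricted to US-ASCII decodes to
exactly the same text whether the receiver assumes US-ASCII or UTF-8:

```python
def is_us_ascii(data: bytes) -> bool:
    """Return True if every byte lies in the US-ASCII range 0x00..0x7F."""
    return all(byte < 0x80 for byte in data)


def same_text_in_both_encodings(data: bytes) -> bool:
    """Return True if decoding as US-ASCII and as UTF-8 gives the same text.

    Raises UnicodeDecodeError if data is not pure US-ASCII.
    """
    return data.decode("ascii") == data.decode("utf-8")


# Hypothetical diagnostic, as a server in the C locale might print it.
c_locale_message = b"cat: /etc/shadow: Permission denied\n"

# Hypothetical UTF-8 output containing one non-ASCII character.
utf8_message = "Zugriff verweigert: Datei geschützt\n".encode("utf-8")

print(is_us_ascii(c_locale_message))
print(same_text_in_both_encodings(c_locale_message))
print(is_us_ascii(utf8_message))
```

The converse does not hold: the second message is valid UTF-8 but
not US-ASCII, which is why the argument only covers the case of the
server in the C locale and the client in UTF-8.
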
Of course, whether it is safe when both the server and the client use
a UTF-8 locale obviously depends on the terminal or terminal emulator,
but at least xterm(1) in UTF-8 mode [but not in the traditional 8-bit
mode that may still be the default on some operating systems] is safe
when the server runs either the C locale or a UTF-8 locale.

[...]
>> That said, on non-OpenBSD systems, if the locale used by a program does
>> not match what the user thinks, the *semantics* of the program may still
>> screw up horribly, even if the character encoding matches.  For example,
>> consider user input of floating point numbers with LC_NUMERIC set to a
>> cultural convention the user isn't aware of.  But such issues are
>> only loosely related to ssh(1) and to terminal security.

> When it comes to terminal security, another approach is to use
> a transient tmux(1) pane or terminal window that is closed once
> the session is complete.

Frankly, i don't know anything about tmux(1) and simply don't know
whether it can or cannot help with the topic at hand.

> This assumes that the mismatch cannot be
> exploited for code execution, but I would be highly surprised if it
> could be, especially with the client in UTF-8 mode.

xterm(1) in UTF-8 mode is quite good because it never interprets
multibyte characters as in-band terminal control codes.  Your mileage
might vary with other terminals or emulators.

Yours,
  Ingo
_______________________________________________
openssh-unix-dev mailing list
openssh-unix-dev@xxxxxxxxxxx
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev