On Sat, 11 Sep 2021, Ingo Schwarze wrote:

> When you first log into a machine, for security reasons, you have […]
> connecting from. From that point onward, whatever locale(1) defaults
> the sysop on the target machine may have chosen no longer apply to you.

Exactly!

> When connecting from a new machine, you need to check your terminal […]
> for this unusual connection before you start any real work on the
> remote machine.

Right. You can make this a little safer by using a terminal in ASCII
mode and…

    $ ssh -t host env LC_ALL=C sh    (or mksh -l or bash --login)

… for that first connection. There can still be attack vectors then,
but in that case (such a lack of trust), you probably don’t want to
connect at all.

> * Which terminals or terminal emulators and which modes you use

This is really tricky. xterm, for example, has control sequences to
trigger hardcopies and other fun, which aren’t disabled *that* easily.
Moreover, not just ESC but also CSI (\x9B) triggers them.

Running something like GNU screen, tmux, or perhaps even window(1)
(I don’t even think this is available on GNU/Linux but OpenBSD at
least had it at some point), which reduces the capabilities of your
terminal, might make sense in a lack-of-trust scenario.

Heh, isn’t that a good project idea: make a reduced-functionality
terminal-in-terminal emulator. Perhaps even one that can recode like
luit(1). Hmm, I guess if I ever have the time for that I could add
this to screen, whose codebase, horrid though it is, I’m at least
somewhat familiar with…

> * Which locales are available on the target machine (none of the
> machines you are connecting from can know that).

The C locale should be ubiquitous, though, on the other hand, it only
defines the first 128 characters; what the upper half means is up to
the system. \x9B, for example, is ¢ in cp437, so it could even show up
in a welcome message listing the prices for using the machine.

> Again, SendEnv / AcceptEnv cannot *make* any of this safe.
> Users need to use their brains to make their connections safe.

For some reason it’s hard to get that point across ☻

But to get back to Florian’s initial question… OpenSSH by default
doesn’t accept or send the locale-related environment variables,
though whether this is because of forethought or simply because
OpenBSD didn’t use them is up to interpretation. So accepting and
sending them is a (somewhat cross-distro) deviation from the normal
behaviour anyway, and “Phasing out forwarding of locale settings”
would just mean returning to the upstream default, so it’s probably
unobjectionable. Getting maintainers to actually do it, now…

> > *Especially* not $TERM with all its historic baggage, I guess.
>
> At least $TERM is usually set by the terminal emulator, so it usually
> matches the terminal you are really using on the client side.

And it’s not $TERMCAP. That’s the funny one.

> Besides, the ssh_config(5) manual explains that passing it is
> required by the protocol, and it is indeed not clear to me how
> a pseudo terminal on the server should behave without it.

Right. Having used systems from a (real or emulated) serial console,
which (obviously) cannot set $TERM properly at first, I can say this
is not fun.

> I don't really see the problem here. In that company, you would
> obviously set all computers to a default of LC_ALL=en_US.UTF-8

I proposed a C.UTF-8 locale around 2013, which has made its way into
first eglibc then glibc in Debian, and musl, and, AFAIHH, FreeBSD(?),
and there’s talk of supporting this more broadly (glibc upstream,
I hope). On the other hand, on systems where this doesn’t exist,
there’s usually en_US.UTF-8, unless the system doesn’t ship all
locales by default (Debian again) or is HP/UX (which needs en_US.utf8).
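
The fallback order just described (C.UTF-8 first, then en_US.UTF-8 or
its “.utf8” spelling) could be sketched in POSIX shell roughly like
this. It is only an illustration under assumptions: the function name
pick_utf8_locale is made up, and it expects to be handed a
newline-separated list of locale names such as “locale -a” prints.

```sh
# Hypothetical helper, not part of any existing tool.
# $1: newline-separated locale names (e.g. the output of "locale -a").
# Prints the first usable UTF-8 locale, spelled the way the system
# itself lists it; returns 1 if neither candidate is available.
pick_utf8_locale() {
    for lang in C en_US; do
        # Accept both codeset spellings ("UTF-8" and "utf8"), any case.
        found=$(printf '%s\n' "$1" |
            grep -i -x "${lang}\.utf-\{0,1\}8" | head -n 1)
        if [ -n "$found" ]; then
            printf '%s\n' "$found"
            return 0
        fi
    done
    return 1
}

# Possible use in a shell initialisation file (assumption: LC_ALL is
# the variable you want to default, as in the quoted suggestion):
#   if loc=$(pick_utf8_locale "$(locale -a 2>/dev/null)"); then
#       LC_ALL=$loc; export LC_ALL
#   fi
```

Printing the name exactly as the system lists it sidesteps the
question of which spelling a given OS accepts.
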
But setting one of these as a sensible default in the global shell
initialisation file on all servers, allowing users to customise it in
their local ones if needed, and…

> tell all employees to make sure all their terminals run in UTF-8
> mode all the time, on all company and private computers they use

… that indeed solves this problem.

Another thing you could do, server-side, is to guess the terminal
encoding. This is fragile as hell though. Years ago I came up with:

• flush all I/O
• output "\030\032\r\xE2\x82\xAC.\033[6n"
• read back the terminal’s response (the cursor column)
  ‣ 1 is probably EUC-JP, EUC-KR
  ‣ 3 is probably UTF-8
  ‣ 4 is probably ISO-8859
  ‣ 5 is probably Shift-JIS
• output "\r \r" and flush

This is fragile for multiple reasons. It depends on the terminal
actually responding to the column enquiry, not exploding on the
characters sent, etc., and (because it needs to flush, send, then wait
for the response) it takes a noticeable amount of time. It’ll also
return the wrong cursor position if the user begins typing while this
is running.

Standardising on UTF-8 terminals is the way to go, in 2021 even more
so than in 2006. Looking at the CVS log, I only wrote this because
Linux’ vt-is-UTF8 utility is Linux-specific.

> can still set LC_ALL=de_DE.UTF-8 or even LC_ALL=ja_JA.UTF-8 to their

(ja_JP, I think)

> > language will avoid any mismatch ... seriously?
>
> No, it will not, and i didn't intend to claim that.
>
> What matters is how people behave, not whether these variables are
> passed around or not.

Right. This suggestion has the greatest potential to avoid mismatches,
though, provided users avoid doing some things (like running non-UTF-8
terminals).

> send to it installed.
> Also remember that locale names are not
> standardized, so your preferred locale might be installed, but using

*cough* HP/UX *cough*

Incidentally, “locale -a” on a GNU system also shows the “.utf8”
variant, but there may be systems that don’t work with *that*, so…

> > least reflect the *current* mode into $TERM, which already *is* both

$TERM is an index into a list of terminals shipped with the server OS.
Adding anything to this is a multi-year process (consider how long it
took for screen to be added) and must be avoided at all costs.

(There’s this Debian package called ncurses-term that ships some extra
entries, so for example GNU screen in xterm has TERM=screen-xterm
instead of just screen, leading to failures with all servers that
don’t have this extra package installed… or simply run a different
operating system. I consider installing this package harmful.)

st and tmux are also usually missing, etc.

The termcap/terminfo databases are AFAICT also not concerned with the
encoding, so this isn’t the right place. It would work if, decades
ago, people had done something like “append +anything to a TERM and
it’ll look it up by basename”, but they didn’t and we can’t change
this now.

bye,
//mirabilos
-- 
Infrastructure expert • tarent solutions GmbH
Am Dickobskreuz 10, D-53121 Bonn • http://www.tarent.de/
Phone +49 228 54881-393 • Fax: +49 228 54881-235
HRB AG Bonn 5168 • USt-ID (VAT): DE122264941
Managing directors: Dr. Stefan Barth, Kai Ebenrett, Boris Esser, Alexander Steeg