On 4/24/07, Andrew Brooks <arb@xxxxxxxxxxxxxxxx> wrote:
> I too am struggling to get to grips with character mappings but am finding it very confusing!
> DOS program -> dosemu mappings -> screen mappings ($TERM, locale env vars, -T option, etc.) -> terminal;
> then on other screen "clients" there is also the corresponding $TERM, locale env vars, screen options -> terminal character encodings, etc.,
> plus the differences between gnome term vs xterm, ...
True, there are lots of layers, and if only one of them is wrong, things go strange. UTF-8 is nice in this respect because it cuts out a few layers.

Without UTF-8, but with iso-8859-1, we have:

$_internal_char_set (e.g. cp437) character 179 (0xb3) is converted to Unicode 0x2502 (Box drawings light vertical).
Try to convert 0x2502 to $_external_char_set (default: locale charmap).
No match: check if $TERM supports the PC character set. If yes, fine: output the smpch terminfo setting, then character 179, then rmpch.
If smpch is not available, check if $TERM supports the alternate (VT100) character set. If yes, fine, but use smacs and acsc to determine what to output.
If neither smpch nor smacs is available, output an approximation (|).

Now $TERM must accurately describe the terminal, and all should be fine. If that isn't true, then "man iso-8859-1" and "mc" will also often fail to display correctly.

With UTF-8, the 0x2502 is unconditionally converted to UTF-8 \xe2\x94\x82, which is sent to the terminal without any $TERM involvement.

Bart
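For anyone who finds the fallback order easier to follow as code, here is a rough Python sketch of the steps described above. It is not dosemu's actual implementation. The function name render, the caps dictionary (standing in for a terminfo lookup of smpch, rmpch, smacs, rmacs and acsc) and the two small lookup tables are all made up for illustration; rmacs is not mentioned above and is assumed here as the counterpart of smacs.

```python
# Illustrative sketch only; names and tables are hypothetical, not dosemu code.

# VT100 alternate-charset names for two box-drawing characters (assumed subset).
ACS_NAMES = {"\u2502": "x", "\u2500": "q"}
# Plain ASCII approximations used as the last resort (assumed subset).
ASCII_APPROX = {"\u2502": "|", "\u2500": "-"}


def render(byte, internal="cp437", external="iso-8859-1", caps=None):
    """Return the bytes to send to the terminal for one DOS character.

    caps is a hypothetical dict of terminfo strings (bytes) for $TERM.
    """
    caps = caps or {}

    # Internal charset byte -> Unicode (cp437 0xb3 becomes U+2502).
    ch = bytes([byte]).decode(internal)

    # Try the external charset first; with UTF-8 this always succeeds.
    try:
        return ch.encode(external)
    except UnicodeEncodeError:
        pass

    # PC character set: smpch, the raw byte, then rmpch.
    if caps.get("smpch") and caps.get("rmpch"):
        return caps["smpch"] + bytes([byte]) + caps["rmpch"]

    # Alternate (VT100) character set: look the glyph up in the acsc pairs.
    name = ACS_NAMES.get(ch)
    if name and caps.get("smacs") and caps.get("rmacs"):
        acsc = caps.get("acsc", b"").decode("latin-1")
        pairs = dict(zip(acsc[0::2], acsc[1::2]))
        if name in pairs:
            return caps["smacs"] + pairs[name].encode("latin-1") + caps["rmacs"]

    # Nothing usable: fall back to an ASCII approximation.
    return ASCII_APPROX.get(ch, "?").encode("ascii")
```

Calling render(179, external="utf-8") takes the short path and never consults caps, which is the "cuts a few layers" point; with the iso-8859-1 default, the result depends entirely on what caps (that is, $TERM) claims the terminal can do.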