From: Robin Rosenberg <robin.rosenberg@xxxxxxxxx>

Since there is now some interest in the topic, I can republish my
2½-year-old patches so there is some real code to comment on. They
apply on top of 6dcfa306f2b67b733a7eb2d7ded1bc9987809edb. For
completeness I am sending all patches, but the interesting stuff is in
patches 4 and 5.

Beware of encoding issues with the test cases. They do not handle
Windows UTF-16 at all, but I think that is just a matter of writing
Windows-specific wrappers for the filename and directory handling
routines.

Feel free to revamp, steal ideas and add constructive criticism. Don't
even think of cherry-picking and rebasing. It's careful handpicking
with copy/paste at best, but mostly it's fuel for discussion.

I'll admit some parts are quite kludgy and probably slow, as I was
primarily interested in seeing whether it was even feasible, which it
was. However, there was simply no interest, which meant there was no
point in optimizing it. It was simply the wrong problem at the time.

Disclaimer: A problem with this approach is that, although it does
character conversion, if you are in a non-UTF-8 locale it will not let
you manage just any repository. That is basically impossible and hence
not the goal. It does help people with the same (or close) languages
to cooperate without enforcing a common encoding, as long as they stick
to the common characters, i.e. the ones that can be converted between
the locales involved.

This is probably the most outdated patch series ever.

-- robin

Robin Rosenberg (8):
  (mostly obsolete)
  UTF helpers
  Messages in locale.
  Extend tests to cover locale wrt to commit messages.
  The interesting stuff (patch 4 & 5)
  UTF file names.
  Extend all tests to work on UTF-8 filenames.
  old wip test of utf_locallinks
  Convert symlink dest in diff
  UTF-8 in non-SHA1-objects

 Makefile                            |   8 +-
 builtin-add.c                       |   5 +-
 builtin-cat-file.c                  |   6 +-
 builtin-checkout-index.c            |  46 +++-
 builtin-commit-tree.c               |   9 +-
 builtin-ls-files.c                  |  26 ++-
 builtin-ls-tree.c                   |  16 +-
 builtin-rev-parse.c                 |   7 +-
 builtin-update-index.c              |  18 +-
 builtin-write-tree.c                |   5 +-
 diff.c                              | 111 ++++++--
 dir.c                               |  22 +-
 git-commit.sh                       |   5 +
 git-compat-util.h                   |  43 +++
 git-rebase.sh                       |   1 +
 git.c                               |   9 +
 log-tree.c                          |   4 +-
 merge-index.c                       |  25 ++-
 read-cache.c                        |   8 +-
 refs.c                              |  11 +-
 setup.c                             |  28 ++-
 t/lib-read-tree-m-3way.sh           |  38 ++--
 t/t-utf-filenames.sh                |  95 +++++++
 t/t-utf-msg.sh                      |  43 +++
 t/t0000-basic.sh                    | 117 ++++----
 t/t0010-racy-git.sh                 |  10 +-
 t/t1000-read-tree-m-3way.sh         | 240 +++++++++---------
 t/t1001-read-tree-m-2way.sh         |  56 ++--
 t/t1020-subdirectory.sh             |  63 +++---
 t/t1100-commit-tree-options.sh      |  12 +-
 t/t1400-update-ref.sh               |  10 +-
 t/t2000-checkout-cache-clash.sh     |  18 +-
 t/t2001-checkout-cache-clash.sh     |  30 +-
 t/t2002-checkout-cache-u.sh         |   8 +-
 t/t2003-checkout-cache-mkdir.sh     | 118 ++++----
 t/t2004-checkout-cache-temp.sh      | 144 +++++-----
 t/t2100-update-cache-badpath.sh     |  48 ++--
 t/t2101-update-index-reupdate.sh    |  56 ++--
 t/t3000-ls-files-others.sh          |  36 ++--
 t/t3002-ls-files-dashpath.sh        |  24 +-
 t/t3010-ls-files-killed-modified.sh | 104 ++++----
 t/t3020-ls-files-error-unmatch.sh   |  10 +-
 t/t3100-ls-tree-restrict.sh         | 122 +++++-----
 t/t3101-ls-tree-dirname.sh          |  88 +++---
 t/t3400-rebase.sh                   |  18 +-
 t/t3401-rebase-partial.sh           |  24 +-
 t/t3402-rebase-merge.sh             |  17 +-
 t/t3403-rebase-skip.sh              |  10 +-
 t/t3500-cherry.sh                   |  26 +-
 t/t3600-rm.sh                       |  28 +-
 t/t3700-add.sh                      |  30 +-
 t/t4000-diff-format.sh              |  26 +-
 t/t4001-diff-rename.sh              |  20 +-
 t/t4002-diff-basic.sh               | 160 ++++++------
 t/t4003-diff-rename-1.sh            |  66 +++---
 t/t4004-diff-rename-symlink.sh      |  40 ++--
 t/t4005-diff-rename-2.sh            |  54 ++--
 t/t4006-diff-mode.sh                |  14 +-
 t/t4008-diff-break-rewrite.sh       | 100 ++++----
 t/t4009-diff-rename-4.sh            |  63 +++---
 t/t4011-diff-symlink.sh             |  38 ++--
 t/t4012-diff-binary.sh              |  16 +-
 t/t7301-rev-parse.sh                |  20 ++
 t/test-lib.sh                       |  13 +-
 test-utf.c                          |  61 +++++
 utf.c                               | 501 +++++++++++++++++++++++++++++++++++
 utf.h                               |  27 ++
 67 files changed, 2133 insertions(+), 1142 deletions(-)
 create mode 100755 t/t-utf-filenames.sh
 create mode 100755 t/t-utf-msg.sh
 create mode 100755 t/t7301-rev-parse.sh
 create mode 100644 test-utf.c
 create mode 100644 utf.c
 create mode 100644 utf.h