> On 06 Mar 2018, at 21:42, Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote: > > On Sun, Mar 4, 2018 at 3:14 PM, <lars.schneider@xxxxxxxxxxxx> wrote: >> Git recognizes files encoded with ASCII or one of its supersets (e.g. >> UTF-8 or ISO-8859-1) as text files. All other encodings are usually >> interpreted as binary and consequently built-in Git text processing >> tools (e.g. 'git diff') as well as most Git web front ends do not >> visualize the content. >> [...] >> Signed-off-by: Lars Schneider <larsxschneider@xxxxxxxxx> >> --- >> diff --git a/convert.c b/convert.c >> @@ -978,6 +1051,25 @@ static int ident_to_worktree(const char *path, const char *src, size_t len, >> +static const char *git_path_check_encoding(struct attr_check_item *check) >> +{ >> + [...] >> + /* >> + * Ensure encoding names are always upper case (e.g. UTF-8) to >> + * simplify subsequent string comparisons. >> + */ >> + return xstrdup_toupper(value); > > xstrdup_toupper() allocates memory... > >> +} >> @@ -1033,6 +1125,7 @@ struct conv_attrs { >> enum crlf_action attr_action; /* What attr says */ >> enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */ >> int ident; >> + const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */ > > ...which is assigned to 'const char *'... > >> }; >> @@ -1064,6 +1158,7 @@ static void convert_attrs(struct conv_attrs *ca, const char *path) >> else if (eol_attr == EOL_CRLF) >> ca->crlf_action = CRLF_TEXT_CRLF; >> } >> + ca->working_tree_encoding = git_path_check_encoding(ccheck + 5); > > ...by this code, and eventually leaked. > > It's too bad it isn't cleaned up (freed), but looking at the callers, > fixing this leak would be mildly noisy (though not particularly > invasive). How much do we care about this leak? Hmm. You are right. That was previously handled by the encoding struct linked list that I removed in this iteration. I forgot about that aspect :/ I don't like it leaking. I think I would like to reintroduce the linked list. This way every encoding is only once in memory. What do you think? >> } else { >> ca->drv = NULL; >> ca->crlf_action = CRLF_UNDEFINED; >> diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh >> @@ -0,0 +1,135 @@ >> +test_expect_success 'check $GIT_DIR/info/attributes support' ' >> + test_when_finished "rm -f test.utf8.raw test.utf32.raw test.utf32.git" && > > It seems weird to be cleaning up files this test didn't create > (test.utf8.raw and test.utf32.raw). Agreed. >> + test_when_finished "git reset --hard HEAD" && >> + >> + echo "*.utf32 text working-tree-encoding=utf-32" >.git/info/attributes && >> + git add test.utf32 && >> + >> + git cat-file -p :test.utf32 >test.utf32.git && >> + test_cmp_bin test.utf8.raw test.utf32.git >> +' >> + >> +test_expect_success 'check unsupported encodings' ' >> + test_when_finished "rm -f err.out" && >> + test_when_finished "git reset --hard HEAD" && > > Resetting to HEAD here is an important cleanup action, but tests don't > usually clean up files such as 'err.out' since such detritus doesn't > usually impact subsequent tests negatively. (Just an observation; no > re-roll needed.) OK. I'll fix it if I reroll. - Lars