> On 09 Mar 2018, at 20:10, Junio C Hamano <gitster@xxxxxxxxx> wrote: > > lars.schneider@xxxxxxxxxxxx writes: > >> +static const char *default_encoding = "UTF-8"; >> + >> ... >> +static const char *git_path_check_encoding(struct attr_check_item *check) >> +{ >> + const char *value = check->value; >> + >> + if (ATTR_UNSET(value) || !strlen(value)) >> + return NULL; >> + >> + if (ATTR_TRUE(value) || ATTR_FALSE(value)) { >> + error(_("working-tree-encoding attribute requires a value")); >> + return NULL; >> + } > > Hmph, so we decide to be loud but otherwise ignore an undefined > configuration? Shouldn't we rather die instead to avoid touching > the user data in unexpected ways? OK. >> + >> + /* Don't encode to the default encoding */ >> + if (!strcasecmp(value, default_encoding)) >> + return NULL; > > Is this an optimization to avoid "recode one encoding to the same > encoding" no-op overhead? Correct. > We already have the optimization in the > same spirit in may existing codepaths that has nothing to do with > w-t-e, and I think we should share the code. Two pieces of thought > comes to mind. > > One is a lot smaller in scale: Is same_encoding() sufficient for > this callsite instead of strcasecmp()? Yes! > The other one is a lot bigger: Looking at all the existing callers > of same_encoding() that call reencode_string() when it returns false, > would it make sense to drop same_encoding() and move the optimization > to reencode_string() instead? > > I suspect that the answer to the smaller one is "yes, and even if > not, it should be easy to enhance/extend same_encoding() to make it > do what we want it to, and such a change will benefit even existing > callers." The answer to the larger one is likely "the optimization > is not about skipping only reencode_string() call but other things > are subtly different among callers of same_encoding(), so such a > refactoring would not be all that useful." I agree. reencode_string() would need to signal 3 cases: 1. reencode performed 2. reencode not necessary 3. reencode failed We could model "reencode not necessary" as "char *in == char *return". However, I think this should be tackled in a separate series. Thanks Lars