Junio C Hamano <gitster@xxxxxxxxx> writes: > "Eric W. Biederman" <ebiederm@xxxxxxxxx> writes: > >> Today there is no sanity checking of what happens when >> extensions.objectFormat is specified multiple times. Catch confused git >> configurations by only allowing this option to be specified once. > > Hmph. I am not sure if this is worth doing, and especially only for > "objectformat". Do we intend to apply different rules other than > "you can give it only once" to other extensions, and if so where > will these rules be catalogued? I do not see particular harm to let > them follow the usual "last one wins". > > If the patch were about trying to make sure that extensions, which > are inherentaly per-repository, appear only in $GIT_DIR/config and > complain if the code gets confused and tried to read them from the > system or global configuration files, I would understand, and > strongly support such an effort, ithough. Unless I misread something, the code only checks for [extensions] in $GIT_DIR/config. > The real sanity we want to enforce is that what is reported by > running "git config extensions.objectformat" must match the object > format that is used in refs and object database. Agreed. Allowing git config extensions.objectformat to change the existing value is allowing the repository to be corrupted. > Manually futzing > the configuration file and adding an entry with a contradictory > value certainly is one way to break that sanity, and this patch may > catch such a breakage, but once we start worrying about manually > futzing the configuration file, the check added here would easily > miss if the futzing is done by replacing instead of adding, so I am > not sure if this extra code is worth its bits. > > But perhaps I am missing something and not seeing why it is worth > insisting on "last one is the first one" for this particular one. I somewhat have blinders on. There are 3 configuration options I am concerned with: extensions.objectFormat extensions.compatObjectFormat core.historicObjectFormat (or whatever name we settle on). One key concern I heard expressed in earlier reviews is that however we handle these options we handle them in such a way as to give ourselves room to rise to challenges in the future. Whatever we do with parsing we have the following logical constraints: For extensions.objectFormat: There can only be a single storage hash. For extensions.compatObjectFormat: There can be no compatibility hash, there can be a single compatibility hash, and depending how things go between now and the next hash function transition we might want multiple compatibility hashes. For core.historicObjectFormat: There can be no historic hash function, there can be a single historic hash function, there can be multiple historic hash functions. For the compatibility hash I think it is unlikely we will want to support more than one compatibility hash in practice but I can imagine a scenario where we just get into the transition from SHA-1 to SHA-256 and a serious break is discovered that requires switching to FutureHash ASAP. For historic object formats like SHA-1 will become post transition there are references embedded in commit comments, email messages, bug trackers. All kinds of places that we can not update so there is fundamentally a need to be able to find which current objects correspond to the historic names. For a project each hash function transition will create more such objects. When I looked I saw two ways within current git to specify a list of values for a single configuration option. - Give that option multiple times. - Parse the option value in such a way as to generate a list. It is my sense just specifying the compatObjectFormat multiple times to specify multiple compatibility object formats makes the most sense. Especially as all is needed today is to only allow a single value. After I had implemented the only allow once logic for compatObjectFormat I saw that objectFormat had nothing similar, and knowing it is a bug for multiple objectFormat wrote a patch to enforce only appear once for objectFormat as well. For objectFormat I don't care very much. For compatObjectFormat I truly care, and for even more for the option that allows finding the current object from a historic oid (even a truncated one) I care very much. For me the fundamental question is if we allow multiples compatibility hashes or historical hashes how do we specify them? Have the option appear more than once? A comma separated list? Whatever we decided I want to enforce that doesn't appear in current configurations so we can support for multiples later. Eric >> Signed-off-by: "Eric W. Biederman" <ebiederm@xxxxxxxxxxxx> >> --- >> setup.c | 8 ++++++++ >> 1 file changed, 8 insertions(+) >> >> diff --git a/setup.c b/setup.c >> index 18927a847b86..ef9f79b8885e 100644 >> --- a/setup.c >> +++ b/setup.c >> @@ -580,6 +580,7 @@ static enum extension_result handle_extension(const char *var, >> if (!strcmp(ext, "noop-v1")) { >> return EXTENSION_OK; >> } else if (!strcmp(ext, "objectformat")) { >> + struct string_list_item *item; >> int format; >> >> if (!value) >> @@ -588,6 +589,13 @@ static enum extension_result handle_extension(const char *var, >> if (format == GIT_HASH_UNKNOWN) >> return error(_("invalid value for '%s': '%s'"), >> "extensions.objectformat", value); >> + /* Only support objectFormat being specified once. */ >> + for_each_string_list_item(item, &data->v1_only_extensions) { >> + if (!strcmp(item->string, "objectformat")) >> + return error(_("'%s' already specified as '%s'"), >> + "extensions.objectformat", >> + hash_algos[data->hash_algo].name); >> + } >> data->hash_algo = format; >> return EXTENSION_OK; >> }