On 7/16/2020 8:25 AM, Jeff King wrote: > On Thu, Jul 16, 2020 at 07:00:08AM -0400, Jeff King wrote: > >>>> To avoid mistakes, continue to forbid repository format upgrades in v0 >>>> repositories with an unrecognized extension. This way, a v0 user >>>> using a misspelled extension field gets a chance to correct the >>>> mistake before updating to the less forgiving v1 format. >>> >>> This needs to be managed carefully. When the next extension is >>> added to the codebase, that extension may be "known" to Git, but I >>> do not think it is a good idea to honor it in v0 repository, or >>> allow upgrading v0 repository to v1 with such an extension that >>> weren't "known" to Git. For example, a topic in flight adds >>> objectformat extension and I do not think it should be honored in v0 >>> repository. >>> >>> Having said that, the approach is OK for now at the tip of tonight's >>> master, but the point is "known" vs "unknown" must be fixed right >>> with some means. E.g. tell people to throw the "new" extensions to >>> the list of "unknown extensions" in check_repo_format() when they >>> add new ones, or something. >> >> Yeah, I agree with this line of reasoning. I'd prefer to see it >> addressed now, so that we don't have to remember to do anything later. >> I.e., for this patch to put the existing known extensions into the >> "good" list for v0, locking it into place forever, and leaving the >> objectformat topic with nothing particular it needs to do. >> >> But in the name of -rc1 expediency, I'm also OK moving forward with this >> for now. > > Hmm, this is actually a bit trickier than I expected because of the way > the code is written. It's much easier to complain about extensions in a > v0 repository than it is to ignore them. But I'm not sure if that isn't > the right way to go anyway. > > The patch I came up with is below (and goes on top of Jonathan's). Even > if we decide this is the right direction, it can definitely happen > post-v2.28. > > -- >8 -- > Subject: verify_repository_format(): complain about new extensions in v0 repo > > We made the mistake in the past of respecting extensions.* even when the > repository format version was set to 0. This is bad because forgetting > to bump the repository version means that older versions of Git (which > do not know about our extensions) won't complain. I.e., it's not a > problem in itself, but it means your repository is in a state which does > not give you the protection you think you're getting from older > versions. > > For compatibility reasons, we are stuck with that decision for existing > extensions. However, we'd prefer not to extend the damage further. We > can do that by catching any newly-added extensions and complaining about > the repository format. > > Note that this is a pretty heavy hammer: we'll refuse to work with the > repository at all. A lesser option would be to ignore (possibly with a > warning) any new extensions. But because of the way the extensions are > handled, that puts the burden on each new extension that is added to > remember to "undo" itself (because they are handled before we know > for sure whether we are in a v1 repo or not, since we don't insist on a > particular ordering of config entries). > > So one option would be to rewrite that handling to record any new > extensions (and their values) during the config parse, and then only > after proceed to handle new ones only if we're in a v1 repository. But > I'm not sure if it's worth the trouble: > > - ignoring extensions is likely to end up with broken results anyway > (e.g., ignoring a proposed objectformat extension means parsing any > object data is likely to encounter errors) > > - this is a sign that whatever tool wrote the extension field is > broken. We may be better off notifying immediately and forcefully so > that such tools don't even appear to work accidentally. > > The only downside is that fixing the istuation is a little tricky, s/istuation/situation > because programs like "git config" won't want to work with the > repository. But: > > git config --file=.git/config core.repositoryformatversion 1 > > should still suffice. > > Signed-off-by: Jeff King <peff@xxxxxxxx> > --- > cache.h | 2 + > setup.c | 96 ++++++++++++++++++++++++++++++++++------- > t/t1302-repo-version.sh | 3 ++ > 3 files changed, 85 insertions(+), 16 deletions(-) > > diff --git a/cache.h b/cache.h > index 654426460c..0290849c19 100644 > --- a/cache.h > +++ b/cache.h > @@ -1044,6 +1044,7 @@ struct repository_format { > int hash_algo; > char *work_tree; > struct string_list unknown_extensions; > + struct string_list v1_only_extensions; > }; > > /* > @@ -1057,6 +1058,7 @@ struct repository_format { > .is_bare = -1, \ > .hash_algo = GIT_HASH_SHA1, \ > .unknown_extensions = STRING_LIST_INIT_DUP, \ > + .v1_only_extensions = STRING_LIST_INIT_DUP, \ > } > > /* > diff --git a/setup.c b/setup.c > index 3a81307602..c1480b2b60 100644 > --- a/setup.c > +++ b/setup.c > @@ -447,6 +447,54 @@ static int read_worktree_config(const char *var, const char *value, void *vdata) > return 0; > } > > +enum extension_result { > + EXTENSION_ERROR = -1, /* compatible with error(), etc */ > + EXTENSION_UNKNOWN = 0, > + EXTENSION_OK = 1 > +}; > + > +/* > + * Do not add new extensions to this function. It handles extensions which are > + * respected even in v0-format repositories for historical compatibility. > + */ > +enum extension_result handle_extension_v0(const char *var, > + const char *value, > + const char *ext, > + struct repository_format *data) ... > +/* > + * Record any new extensions in this function. > + */ > +enum extension_result handle_extension(const char *var, > + const char *value, > + const char *ext, > + struct repository_format *data) I like the split between these two methods to make it really clear the difference between "v0" and "v1". > struct repository_format *data = vdata; > @@ -455,23 +503,25 @@ static int check_repo_format(const char *var, const char *value, void *vdata) > if (strcmp(var, "core.repositoryformatversion") == 0) > data->version = git_config_int(var, value); > else if (skip_prefix(var, "extensions.", &ext)) { ... > + switch (handle_extension_v0(var, value, ext, data)) { > + case EXTENSION_ERROR: > + return -1; > + case EXTENSION_OK: > + return 0; > + case EXTENSION_UNKNOWN: > + break; > + } > + > + switch (handle_extension(var, value, ext, data)) { > + case EXTENSION_ERROR: > + return -1; > + case EXTENSION_OK: > + string_list_append(&data->v1_only_extensions, ext); > + return 0; > + case EXTENSION_UNKNOWN: > string_list_append(&data->unknown_extensions, ext); > + return 0; > + } > } And it makes this loop much cleaner. > @@ -613,6 +665,18 @@ int verify_repository_format(const struct repository_format *format, > return -1; > } > > + if (format->version == 0 && format->v1_only_extensions.nr) { > + int i; > + > + strbuf_addstr(err, > + _("repo version is 0, but v1-only extensions found:")); > + > + for (i = 0; i < format->v1_only_extensions.nr; i++) > + strbuf_addf(err, "\n\t%s", > + format->v1_only_extensions.items[i].string); > + return -1; > + } > + > return 0; > } > > diff --git a/t/t1302-repo-version.sh b/t/t1302-repo-version.sh > index d60c042ce8..0acabb6d11 100755 > --- a/t/t1302-repo-version.sh > +++ b/t/t1302-repo-version.sh > @@ -87,6 +87,9 @@ allow 1 > allow 1 noop > abort 1 no-such-extension > allow 0 no-such-extension > +allow 0 noop > +abort 0 noop-v1 > +allow 1 noop-v1 LGTM. Thanks, -Stolee