Brandon Williams <bmwill@xxxxxxxxxx> writes: > Currently the submodule.<name>.url config option is used to determine > if a given submodule exists and is interesting to the user. This > however doesn't work very well because the URL is a config option for > the scope of a repository, whereas the existence of a submodule is an > option scoped to the working tree. A submodule exists if it is exists, whether the user is interested in it or not. If it should be checked out in the working tree is a different matter, but that should be a logical AND between "is it of interest?" and "is the superproject tree has a gitlink for it in its working tree?". So I do not agree with "This however doesn't work" at all. I'd understand it if you said "This is cumbersome if we want to do this and that, which are different from what we have done traditionally" and explain what this and that are and how they are different. > In a future with worktree support for submodules, there will be multiple > working trees, each of which may only need a subset of the submodules > checked out. The URL (which is where the submodule repository can be > obtained) should not differ between different working trees. And this makes the motivation a bit clearer. When the user wants to have multiple worktrees for the same superproject. In such a setting, the same submodule in two worktrees typically want to have the same URL. It may be different from what the upstream suggests in the .gitmodules file, but the difference, i.e. the site specific customization of the URL, should be the same between the two worktrees. But one worktree may be and the other worktree may not be interested in that submodule, and with shared .git/config file, you cannot have submodule.<name>.url set to one value and unset at the same time. This series does not solve the "two worktrees cannot have private parts in the configuration namespace" issue, but assuming it will be solved by some other series, it anticipates that submodule.<name>.URL would want to be shared between two worktrees most of time (note that there will be users who want two separate .URL for the same submodule while sharing the object database via worktrees mechanism, and you'll need to prepare for them, too), and another "bit" that tells if the submodule is of interest would want to be private to each worktree. That is the motivation, the reason why you want .URL to stop serving the dual purpose of overriding upstream-suggested URL and indicating the submodule is interesting to the user. > It may also be convenient for users to more easily specify groups of > submodules they are interested in as apposed to running "git submodule > init <path>" on each submodule they want checked out in their working > tree. > > To this end two config options are introduced, submodule.active and > submodule.<name>.active. The submodule.active config holds a pathspec > that specifies which submodules should exist in the working tree. The > submodule.<name>.active config is a boolean flag used to indicate if > that particular submodule should exist in the working tree. And because two worktrees always share their .git/config, these new configuration variables are useless to help workflow with multiple worktrees with the current system, until "per-worktree configuration" is invented. But we prepare for that future in this step. Also submodule.active that takes pathspec and not name is an oddball (use of "name" not "path" is to prepare for a submodule whose location in the superproject changes depending on the commit in the superproject), and we need to justify with an explanation. I think you envision two cases. 1. we encourage projects to adopt a convention that submodules are grouped with leading directory, so that pathspec e.g. lib/, would cover _all_ library-ish modules to allow those who are interested in library-ish modules to set ".active = lib/" just once to say any and all modules in lib/ are interesting. 2. another convention the projects can adopt, when pathspec-attribute feature is invented, is to label submodules with attribute to group them, so that a broad pathspec with attribute requirement, e.g. .:(attr:lib), can be used to say any and all modules with 'lib' attribute are interesting. The above two points (justifications, intended uses and future plans) need to be clarified around here (and possibly in the documentation), I would think. > diff --git a/submodule.c b/submodule.c > index 0a2831d84..2b33bd70f 100644 > --- a/submodule.c > +++ b/submodule.c > @@ -217,13 +217,41 @@ void gitmodules_config_sha1(const unsigned char *commit_sha1) > int is_submodule_initialized(const char *path) > { > int ret = 0; > - const struct submodule *module = NULL; > + char *key; > + const struct string_list *sl; > + const struct submodule *module = submodule_from_path(null_sha1, path); > > - module = submodule_from_path(null_sha1, path); > + /* early return if there isn't a path->module mapping */ > + if (!module) > + return 0; > + > + /* submodule.<name>.active is set */ > + key = xstrfmt("submodule.%s.active", module->name); > + if (!git_config_get_bool(key, &ret)) { > + free(key); > + return ret; > + } > + free(key); > + > + sl = git_config_get_value_multi("submodule.active"); > > - if (module) { > - char *key = xstrfmt("submodule.%s.url", module->name); > + if (sl) { > + struct pathspec ps; > + struct argv_array args = ARGV_ARRAY_INIT; > + const struct string_list_item *item; > + > + for_each_string_list_item(item, sl) { > + argv_array_push(&args, item->string); > + } > + > + parse_pathspec(&ps, 0, 0, 0, args.argv); > + ret = match_pathspec(&ps, path, strlen(path), 0, NULL, 1); > + > + argv_array_clear(&args); > + clear_pathspec(&ps); > + } else { > char *value = NULL; > + key = xstrfmt("submodule.%s.url", module->name); > > ret = !git_config_get_string(key, &value); It probably is easier to read if you had a final "return ret" in the "if (sl) {...}" part, just like you have one for the codepath that deals with "submodule.<name>.active", and flatten the else clause. That would make it clear that we have three ways with decreasing precedence. At this point, the answer from function is even less about "is it initialized?" but about "is it of interest?" (or "is it to be initialized?"). We'd probably want a /* NEEDSWORK */ comment before the function to remind us to come up with a better name after the dust settles. > diff --git a/t/t7413-submodule-is-active.sh b/t/t7413-submodule-is-active.sh > index f18e0c925..c41b899ab 100755 > --- a/t/t7413-submodule-is-active.sh > +++ b/t/t7413-submodule-is-active.sh > @@ -28,4 +28,59 @@ test_expect_success 'is-active works with urls' ' > git -C super submodule--helper is-active sub1 > ' > > +test_expect_success 'is-active works with submodule.<name>.active config' ' > + git -C super config --bool submodule.sub1.active "false" && > + test_must_fail git -C super submodule--helper is-active sub1 && > + > + git -C super config --bool submodule.sub1.active "true" && > + git -C super config --unset submodule.sub1.URL && > + git -C super submodule--helper is-active sub1 && > + > + git -C super config submodule.sub1.URL ../sub && > + git -C super config --unset submodule.sub1.active > +' The last "unset" is done to clean the customization this test did, in order to give a predictable beginning state to the next test? If so, use test_when_finished instead of &&-cascading it at the end. > + ... > +test_expect_success 'is-active with submodule.active and submodule.<name>.active' ' > + git -C super config --add submodule.active "sub1" && > + git -C super config --bool submodule.sub1.active "false" && > + git -C super config --bool submodule.sub2.active "true" && > + > + test_must_fail git -C super submodule--helper is-active sub1 && > + git -C super submodule--helper is-active sub2 && > + > + git -C super config --unset-all submodule.active && > + git -C super config --unset submodule.sub1.active && > + git -C super config --unset submodule.sub2.active > +' Likewise for all the new tests in this patch. Thanks.