On Mon, Nov 22 2021, Glen Choo wrote: > As we introduce a submodule UX with branches, we would like to be able > to get the submodule commit ids in a superproject tree because those ids > are the source of truth e.g. "git branch --recurse-submodules topic > start-point" should create branches based off the commit ids recorded in > the superproject's 'start-point' tree. > > To make this easy, introduce a submodules_of_tree() helper function that > iterates through a tree and returns the tree's gitlink entries as a > list. > > Signed-off-by: Glen Choo <chooglen@xxxxxxxxxx> > --- > submodule-config.c | 19 +++++++++++++++++++ > submodule-config.h | 13 +++++++++++++ > 2 files changed, 32 insertions(+) > > diff --git a/submodule-config.c b/submodule-config.c > index f95344028b..97da373301 100644 > --- a/submodule-config.c > +++ b/submodule-config.c > @@ -7,6 +7,7 @@ > #include "strbuf.h" > #include "object-store.h" > #include "parse-options.h" > +#include "tree-walk.h" > > /* > * submodule cache lookup structure > @@ -726,6 +727,24 @@ const struct submodule *submodule_from_path(struct repository *r, > return config_from(r->submodule_cache, treeish_name, path, lookup_path); > } > > +struct submodule_entry_list * > +submodules_of_tree(struct repository *r, const struct object_id *treeish_name) > +{ > + struct tree_desc tree; > + struct name_entry entry; > + struct submodule_entry_list *ret; > + > + CALLOC_ARRAY(ret, 1); > + fill_tree_descriptor(r, &tree, treeish_name); > + while (tree_entry(&tree, &entry)) { > + if (!S_ISGITLINK(entry.mode)) > + continue; > + ALLOC_GROW(ret->name_entries, ret->entry_nr + 1, ret->entry_alloc); > + ret->name_entries[ret->entry_nr++] = entry; > + } > + return ret; > +} > + > void submodule_free(struct repository *r) > { > if (r->submodule_cache) > diff --git a/submodule-config.h b/submodule-config.h > index 65875b94ea..4379ec77e3 100644 > --- a/submodule-config.h > +++ b/submodule-config.h > @@ -6,6 +6,7 @@ > #include "hashmap.h" > #include "submodule.h" > #include "strbuf.h" > +#include "tree-walk.h" > > /** > * The submodule config cache API allows to read submodule > @@ -67,6 +68,18 @@ const struct submodule *submodule_from_name(struct repository *r, > const struct object_id *commit_or_tree, > const char *name); > > +struct submodule_entry_list { > + struct name_entry *name_entries; > + int entry_nr; > + int entry_alloc; > +}; > + > +/** > + * Given a tree-ish, return all submodules in the tree. > + */ > +struct submodule_entry_list * > +submodules_of_tree(struct repository *r, const struct object_id *treeish_name); > + > /** > * Given a tree-ish in the superproject and a path, return the submodule that > * is bound at the path in the named tree. Having skimmed through this topic isn't this in 4/4 the only resulting caller: +void create_submodule_branches(struct repository *r, const char *name, + const char *start_name, int force, int reflog, + int quiet, enum branch_track track) +{ + int i = 0; + char *branch_point = NULL; + struct repository *subrepos; + struct submodule *submodules; + struct object_id super_oid; + struct submodule_entry_list *submodule_entry_list; + char *err_msg = NULL; + + validate_branch_start(r, start_name, track, &super_oid, &branch_point); + + submodule_entry_list = submodules_of_tree(r, &super_oid); + CALLOC_ARRAY(subrepos, submodule_entry_list->entry_nr); + CALLOC_ARRAY(submodules, submodule_entry_list->entry_nr); + + for (i = 0; i < submodule_entry_list->entry_nr; i++) { I think it would be better to just intorduce this function at the same time as its (only?) user, which also makes it clear how it's used. In this case this seems like quite a bit of over-allocation. I.e. we return a malloc'd pointer, and iterate with tree_entry(), the caller then needs to loop over that and do its own allocations of "struct repository *" and "struct submodule *". Wouldn't it be better just to have this new submodule_entry_list contain a list of not "struct name_entry", but: struct new_thingy { struct name_entry *entry; struct repository *repo; struct submodule *submodule; } Then have the caller allocate the container on the stack, pass it to this function. Maybe not, just musings while doing some light reading. I was surprised at what are effectively two loops over the same data, first allocating 1/3, then the other doing the other 2/3...