Atharva Raykar <raykar.ath@xxxxxxxxx> writes: > Let's add a new "add-clone" subcommand to `git submodule--helper` with > the goal of converting part of the shell code in git-submodule.sh > related to `git submodule add` into C code. This new subcommand clones > the repository that is to be added, and checks out to the appropriate > branch. > > This is meant to be a faithful conversion that leaves the behaviour of > 'submodule add' unchanged. The only minor change is that if a submodule name has > been supplied with a name that clashes with a local submodule, the message shown > to the user ("A git directory for 'foo' is found locally...") is prepended with > "error" for clarity. > > This is part of a series of changes that will result in all of 'submodule add' > being converted to C. > > Signed-off-by: Atharva Raykar <raykar.ath@xxxxxxxxx> > Mentored-by: Christian Couder <christian.couder@xxxxxxxxx> > Mentored-by: Shourya Shukla <shouryashukla.oo@xxxxxxxxx> > Based-on-patch-by: Shourya Shukla <shouryashukla.oo@xxxxxxxxx> > Based-on-patch-by: Prathamesh Chavan <pc44800@xxxxxxxxx> > Helped-by: Đoàn Trần Công Danh <congdanhqx@xxxxxxxxx> > --- > builtin/submodule--helper.c | 180 ++++++++++++++++++++++++++++++++++++ > git-submodule.sh | 38 +------- > 2 files changed, 181 insertions(+), 37 deletions(-) > > diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c > index d55f6262e9..21b2e9fb14 100644 > --- a/builtin/submodule--helper.c > +++ b/builtin/submodule--helper.c > @@ -2745,6 +2745,185 @@ static int module_set_branch(int argc, const char **argv, const char *prefix) > return !!ret; > } > > +struct add_data { > + const char *prefix; > + const char *branch; > + const char *reference_path; > + const char *sm_path; > + const char *sm_name; > + const char *repo; > + const char *realrepo; > + int depth; > + unsigned int force: 1; > + unsigned int quiet: 1; > + unsigned int progress: 1; > + unsigned int dissociate: 1; > +}; > +#define ADD_DATA_INIT { .depth = -1 } > + > +static void show_fetch_remotes(FILE *output, const char *sm_name, const char *git_dir_path) > +{ > + struct child_process cp_remote = CHILD_PROCESS_INIT; > + struct strbuf sb_remote_out = STRBUF_INIT; > + > + cp_remote.git_cmd = 1; > + strvec_pushf(&cp_remote.env_array, > + "GIT_DIR=%s", git_dir_path); > + strvec_push(&cp_remote.env_array, "GIT_WORK_TREE=."); > + strvec_pushl(&cp_remote.args, "remote", "-v", NULL); > + if (!capture_command(&cp_remote, &sb_remote_out, 0)) { > + char *next_line; > + char *line = sb_remote_out.buf; > + while ((next_line = strchr(line, '\n')) != NULL) { > + size_t len = next_line - line; > + if (strip_suffix_mem(line, &len, " (fetch)")) > + fprintf(output, " %.*s\n", (int)len, line); > + line = next_line + 1; Good. That's a handy helper to use here. > + } > + } > + > + strbuf_release(&sb_remote_out); > +} > + > +static int add_submodule(const struct add_data *add_data) > +{ > + char *submod_gitdir_path; > + > + /* perhaps the path already exists and is already a git repo, else clone it */ > + if (is_directory(add_data->sm_path)) { > + submod_gitdir_path = xstrfmt("%s/.git", add_data->sm_path); > + if (is_directory(submod_gitdir_path) || file_exists(submod_gitdir_path)) This is a faithful translation from the scripted Porcelain that did not have access to our C internals that can do a lot better job at deciding if a given "$path", which has ".git" directly or file below it, is the top of a working tree that is governed by that ".git" thing. I wonder if setup.c::is_nonbare_repository_dir() can be used here. You really want to see add_data->sm_path is the top of the working tree of a submodule, so requiring only that ".git" thing is either a directory or a file, which was the cheap thing the scripted version could do without expending too much engineering effort, is a shame when you are now writing in C and all the infrastructure to do much better job is available to you. > + printf(_("Adding existing path at '%s' to index\n"), > + add_data->sm_path); Unintended change to the message here; you lost "the". > + else > + die(_("'%s' already exists and is not a valid git repo"), > + add_data->sm_path); > + free(submod_gitdir_path); > + } else { > + struct strvec clone_args = STRVEC_INIT; > + struct child_process cp = CHILD_PROCESS_INIT; > + submod_gitdir_path = xstrfmt(".git/modules/%s", add_data->sm_name); > + > + if (is_directory(submod_gitdir_path)) { > + if (!add_data->force) { > + error(_("a git directory for '%s' is found " > + "locally with remote(s):"), add_data->sm_name); The original is a mere eval_gettextln, so a faithful conversion would be to emit this to the standard error stream, without letting error() to add "error: " prefix and downcasing "A git directory". > + show_fetch_remotes(stderr, add_data->sm_name, > + submod_gitdir_path); > + fprintf(stderr, > + _("If you want to reuse this local git " > + "directory instead of cloning again from\n" > + " %s\n" > + "use the '--force' option. If the local git " > + "directory is not the correct repo\n" > + "or if you are unsure what this means, choose " > + "another name with the '--name' option.\n"), > + add_data->realrepo); > + free(submod_gitdir_path); > + return 1; The updated caller in the scripted Porcelain just silently exits upon an error, so the above message that used to be given to die() needs to be emitted by us like this. You may want to do die(), instead of fprintf(stderr) followed by return 1, to be a more faithful conversion. > + } else { > + printf(_("Reactivating local git directory for " > + "submodule '%s'\n"), add_data->sm_name); > + } > + } > + free(submod_gitdir_path); And at this point, the original scripted Porcelain used to call "git submodule--helper clone". > + strvec_pushl(&clone_args, "clone", "--path", add_data->sm_path, "--name", > + add_data->sm_name, "--url", add_data->realrepo, NULL); > + if (add_data->quiet) > + strvec_push(&clone_args, "--quiet"); > + if (add_data->progress) > + strvec_push(&clone_args, "--progress"); > + if (add_data->prefix) > + strvec_pushl(&clone_args, "--prefix", add_data->prefix, NULL); > + if (add_data->reference_path) > + strvec_pushl(&clone_args, "--reference", > + add_data->reference_path, NULL); > + if (add_data->dissociate) > + strvec_push(&clone_args, "--dissociate"); > + if (add_data->depth >= 0) > + strvec_pushf(&clone_args, "--depth=%d", add_data->depth); > + > + if (module_clone(clone_args.nr, clone_args.v, add_data->prefix)) { > + strvec_clear(&clone_args); > + return -1; > + } > + strvec_clear(&clone_args); All of the above does the equivalent of it, but having to "unparse" members of the add_data structure and stuff clone_args, only to be parsed by parse_options() call in module_clone(), is somewhat unsatisfactory. I think this series would be helped greatly if there were an additional step that splits module_clone() into two, factoring out everything that happens after module_clone() calls parse_options() into a helper function. module_clone() will call that helper function, and then at this step, instead of the above "unparse and then call module_clone()", this place will call the same helper function. In any case, after doing "git submodule--helper clone", the original scripted version does "git checkout" in the newly cloned working tree. I wonder if the "module_clone" is done with "--no-checkout" option---cloning and letting a commit checked out, and checking out another commit immediately after it, feels somewhat wasteful. > + prepare_submodule_repo_env(&cp.env_array); > + cp.git_cmd = 1; > + cp.dir = add_data->sm_path; > + strvec_pushl(&cp.args, "checkout", "-f", "-q", NULL); > + > + if (add_data->branch) { > + strvec_pushl(&cp.args, "-B", add_data->branch, NULL); > + strvec_pushf(&cp.args, "origin/%s", add_data->branch); > + } > + > + if (run_command(&cp)) > + die(_("unable to checkout submodule '%s'"), add_data->sm_path); > + } > + return 0; > +} Other than all the commented parts, this looks fairly faithful conversion from the part of the scripted Porcelain that has been lost by the same patch. > +static int add_clone(int argc, const char **argv, const char *prefix) > +{ > + int force = 0, quiet = 0, dissociate = 0, progress = 0; > + struct add_data add_data = ADD_DATA_INIT; > + ... > + argc = parse_options(argc, argv, prefix, options, usage, 0); > + > + if (argc != 0) > + usage_with_options(usage, options); > + > + add_data.progress = !!progress; > + add_data.dissociate = !!dissociate; > + add_data.force = !!force; > + add_data.quiet = !!quiet; > + > + if (add_submodule(&add_data)) > + return 1; > + > + return 0; > +} As you wrote the add_clone() vs add_submodule() split of labour, the reason this one is much more nicely organized than module_clone() should be obvious to you. This part of the design of this patch is very well done. The "additional step" I hinted earlier is exactly that---by making the guts of the feature add_submodule() to work on already parsed data and a separate wrapper add_clone() that interfaces to the scripts to parse the command line arguments into a shape that is more appropriate for the inner function, you can later call the inner function add_submodule() directly without going through the unparsing-then-parsing dance from other places. If module_clone() were split into two in a similar way, then the place to call it from add_submodule() would become much cleaner. Thanks.