Re: [GSoC][RFC/PATCH v2] submodule: port subcommand foreach from shell to C

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> The First function module_foreach first parses the options present in

s/First/first/

> The second function also takes care of the recursive flag, by creating
> a saperate child_process structure and prepending "--super-prefix displaypath",

 s/saperate/separate

> Additional env variable $sm_path was added, since it was used in
> test '"submodule foreach" from subdirectory' in t7407.
> I preferred adding this, instead of changing the test case, since
> in the case of git-submodule.sh, this env variable was accessible.

Makes sense. Though this explanation could go into the commit message as well.

> I checked-out the commit 1c4fb136db (submodule foreach: skip eval for
> more than one argument, 2013-09-27), which explains that why for
> the case when argc>1, we do not use eval. But since, we are calling the
> command in a separate shell itself for all values of argc, hence IMO,
> this case need not be considered separately.

Makes sense. Though this explanation could go into the commit message as well.

> I have observed that when we recursively run a command foreach
> submodule from a subdirectory, the $path variable as finally obtained
> by this patch differs with the $path variable as observed by the
> existing git-submodule code for a nested submodule.

Uh, then the recuring is still a bit broken.

> I'll again mention that I have based my branch on
> gitster/jk/ls-files-recurse-submodules-fix, since while
> using --super-prefix in recursively calling the foreach

Oops, missed that part. (Later in the reply I hint at the patch not
applying correctly, disregard that part.)

> Also, in the function foreach_submodule, we call gitmodules_config()
> to read values from the worktree .gitmodules and then look up
> the information (in this case only the sub->name) by using
> submodule_from_path funciton. Since we don't want to
> overwrite the null_sha1 entry, only loads from .gitmodules
> and avoid overlaying with .git/config.

The null_sha1 entry is there nevertheless.

> (also, since this whole process is required only to get the value
> og submodule's name, is there some other way by which we may obtain
> the value so as to avoid this step?)

Unfortunately this is the only way to get the name. But that's right, for
just getting the name we do not need the .git/config overlay as that
is not supposed to overwrite name/path mappings.

> As currently finally exams are going on in my college, I was unable to
> work on it much and the submission got delayed.

Good luck with the finals. :)

+
+ /* Only loads from .gitmodules, no overlay with .git/config */
+ gitmodules_config();

We should do the config loading/parsing not
in the inner looping function, but just once outside.

+
+ argv_array_pushf(&cp.env_array, "name=%s", sub->name);
+ argv_array_pushf(&cp.env_array, "path=%s", displaypath);
+ argv_array_pushf(&cp.env_array, "sm_path=%s", displaypath);

You mention keeping 'sm_path' in the notes after the commit message. I would
add that part to the commit message, to explain why we have multiple variables
that have the same value. Maybe even a comment in the code:

    /* Keep sm_path for historic reasons, see tests in 091a6eb0fee. */
    .. sm_path ..

+ argv_array_pushf(&cp.env_array, "sha1=%s", sub_sha1.buf);
+ argv_array_pushf(&cp.env_array, "toplevel=%s", toplevel);
+
+ cp.use_shell = 1;
+ cp.dir = path;
+
+ for (i = 0; i < argc; i++)
+ argv_array_push(&cp.args, argv[i]);
+
+ strbuf_addstr(&file, path);
+ strbuf_addstr(&file, "/.git");

This could be oneline as:
        strbuf_addf(&file, "%s/.git", path);
+
+ if (!quiet && !access_or_warn(file.buf, R_OK, 0))
+ printf(_("Entering '%s'\n"), displaypath);
+
+ if (!access_or_warn(file.buf, R_OK, 0))

I do not think we'd want to warn about a missing .git file (which is what the
or_warn part does).

Instead I'd rather use the abstract function 'is_submodule_populated_gently'.

+ run_command(&cp);
+
+ if(recursive) {
+ struct child_process cpr = CHILD_PROCESS_INIT;
+
+ cpr.use_shell = 1;
+ cpr.dir = path;
+
+ argv_array_pushf(&cpr.args, "git");
+ argv_array_pushf(&cpr.args, "--super-prefix");
+ argv_array_push(&cpr.args, displaypath);
+ argv_array_pushf(&cpr.args, "submodule--helper");

No need for the 'f'ormat version of pusing strings if there is no formatting
going on. An alternative would be to push them all at once:

    argv_array_pushl(&cpr.args, "git", "--super-prefix", displaypath,
                     submodule--helper, NULL);

(The 'l' version of pushing accepts an arbitrary number of arguments until the
final NULL)


+ if (quiet)
+ argv_array_pushf(&cpr.args, "--quiet");
+
+ argv_array_pushf(&cpr.args, "foreach");
+ argv_array_pushf(&cpr.args, "--recursive");


+
+ for (i = 0; i < argc; i++)
+ argv_array_push(&cpr.args, argv[i]);
+
+ run_command(&cpr);

I'd think we'd need to react to the exit code here (and above), and even
stop early just as the shell version does. (To find out how it does so,
I used gitk, and followed the error line
  die "$(eval_gettext "Stopping at '\$displaypath'; script returned
non-zero status.")"
and right clicked on that line "Show origin of that line". There is a lot of
noise going on, e.g. i18n, path issues being fixed, and such, until you'd find
19a31f9c1a (git-submodule - Add 'foreach' subcommand, 2008-08-10), which
outlines more clearly its basic functionality:

    cmd_foreach()
    {
        git ls-files --stage | grep '^160000 ' |
        while read mode sha1 stage path
        do
            if test -e "$path"/.git
            then
                say "Entering '$path'"
                (cd "$path" && eval "$@") ||
                die "Stopping at '$path'; script returned non-zero status."
            fi
        done
    }

Note the && || operators, that act as
https://en.wikipedia.org/wiki/Short-circuit_evaluation


I do not think we would need the following block of code:

+ if (prefix) {
+ const char *out = NULL;
+ if (skip_prefix(prefix, list.entries[i]->name, &out)) {
+ if (out && out[0] == '/' && !out + 1)
+ return 0;
+ }
+ }

because 'module_list_compute' was given the 'prefix', such that the list only
contains list submodules within the 'prefix' area. However with the early
return inside, I wonder what this code is about? (If this is not about
scoping to the directory with prefix, maybe a comment could clarify?)

+ foreach_submodule(argc, argv, list.entries[i]->name,
+  list.entries[i]->oid.hash, prefix,
+  quiet, recursive);

As Junio suggests, the name of foreach_submodule can be improved.
Also as I think further about it, we may actually want to have 3
(instead of 2) functions:

* The outer-most function is what we have here except the for-loop.
  (i.e. it is the front-end function for "submodule--helper foreach")
* the actual looping function, as that can be re-used later from within
  C, then we would not need to spawn an extra submodule--helper, but
  could directly call the
    foreach_submodule(<list>, <function ptr>, <data ptr>)
* The inner-most function is what is currently named 'foreach_submodule'
  it "does one thing", in this case runa process in each submodule.

An example of such an approach is seen with the functions:
  'for_each_string_list(list, string_list_each_func_t, cb_data);'
  (not the macro ending in _item)
  'for_each_*' as found in cache.h

    int for_each_string_list(struct string_list *list,
                 string_list_each_func_t fn, void *cb_data)
    {
        int i, ret = 0;
        for (i = 0; i < list->nr; i++)
            if ((ret = fn(&list->items[i], cb_data)))
                break;
        return ret;
    }

That way we'd have a nice helper function in the future to iterate
over submodules with a given list.

--

The following hunk changed in upstream, which is why I could not apply
the patch.
The colliding commit is cf9e55f494 (submodule: prevent backslash expantion in
submodule names, 2017-04-07), which added just a "-r" to the "while
read mode..."

So if you'd rebase your work on top of the latest master, this will produce
a conflict as well for you.

+ git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix
"$prefix"} submodule--helper foreach ${GIT_QUIET:+--quiet}
${recursive:+--recursive} "$@"

- toplevel=$(pwd)
-
- # dup stdin so that it can be restored when running the external
- # command in the subshell (and a recursive call to this function)
- exec 3<&0
-
..

Thanks,
Stefan



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]