Lars Hjemli <hjemli@xxxxxxxxx> writes: > When working with multiple, unrelated (or loosly related) git repos, > there is often a need to locate all repos with uncommitted work and > perform some action on them (say, commit and push). Before this patch, > such tasks would require manually visiting all repositories, running > `git status` within each one and then decide what to do next. > > This mundane task can now be automated by e.g. `git for-each-repo --dirty > status`, which will find all non-bare git repositories below the current > directory (even nested ones), check if they are dirty (as defined by > `git diff --quiet && git diff --cached --quiet`), and for each dirty repo > print the path to the repo and then execute `git status` within the repo. > > The command also honours the option '--clean' which restricts the set of > repos to those which '--dirty' would skip, and '-x' which is used to > execute non-git commands. It might make sense to internally use RUN_GIT_CMD flag when the first word of the command line is 'git' as an optimization, but I am not sure it is a good idea to force the end users to think when to use -x and when not to is a good idea. In other words, I think git for-each-repo -d diff --name-only git for-each-repo -d -x ls '*.c' is less nice than letting the user say git for-each-repo -d git diff --name-only git for-each-repo -d ls '*.c' > Finally, the command to execute within each repo is optional. If none is > given, git-for-each-repo will just print the path to each repo found. And > since the command supports -z, this can be used for more advanced scripting > needs. It amounts to the same thing, but I would rather describe it as: To allow scripts to handle paths with shell-unsafe characters, support "-z" to show paths with NUL termination. Otherwise, such paths are shown with the usual c-quoting. One more thing that nobody brought up during the previous reviews is if we want to support subset of repositories by allowing the standard pathspec match mechanism. For example, git for-each-repo -d git diff --name-only -- foo/ bar/b\*z might be a way to ask "please find repositories match the given pathspecs (i.e. foo/ bar/b\*z) and run the command in the ones that are dirty". We would need to think about how to mark the end of the command though---we could borrow \; from find(1), even though find is not the best example of the UI design. I.e. git for-each-repo -d git diff --name-only \; [--] foo/ bar/b\*z with or without "--". > diff --git a/Documentation/git-for-each-repo.txt b/Documentation/git-for-each-repo.txt > new file mode 100644 > index 0000000..fb12b3f > --- /dev/null > +++ b/Documentation/git-for-each-repo.txt > @@ -0,0 +1,71 @@ > +git-for-each-repo(1) > +==================== > + > +NAME > +---- > +git-for-each-repo - Execute a git command in multiple non-bare repositories There is a separate topic in flight that turns s/git/Git/ when we refer to the system as a whole. In any case, this is no longer limited to "execute a Git command". Find non-bare Git repositories in subdirectories or Find or execute a command in non-bare Git repositories in subdirectories perhaps? > +SYNOPSIS > +-------- > +[verse] > +'git for-each-repo' [-acdxz] [command] > + > +DESCRIPTION > +----------- > +The git-for-each-repo command is used to locate all non-bare git Should be sufficient to say s/is used to locate/locates/. > +repositories within the current directory tree, and optionally > +execute a git command in each of the found repos. s/a git command/a command/; > +OPTIONS > +------- > ... > +-x:: > + Execute a genric (non-git) command in each repo. Drop this option. > +NOTES > +----- > + > +For the purpose of `git-for-each-repo`, a dirty worktree is defined as a > +worktree with uncommitted changes. Is it a definition that is different from usual? If so why does it need to be inconsistent with the rest of the system? > diff --git a/builtin/for-each-repo.c b/builtin/for-each-repo.c > new file mode 100644 > index 0000000..9333ae0 > --- /dev/null > +++ b/builtin/for-each-repo.c > @@ -0,0 +1,145 @@ > +/* > + * "git for-each-repo" builtin command. > + * > + * Copyright (c) 2013 Lars Hjemli <hjemli@xxxxxxxxx> > + */ > +#include "cache.h" > +#include "color.h" > +#include "quote.h" > +#include "builtin.h" > +#include "run-command.h" > +#include "parse-options.h" > + > +#define ALL 0 > +#define DIRTY 1 > +#define CLEAN 2 > + > +static char *color = GIT_COLOR_NORMAL; > +static int eol = '\n'; > +static int match; > +static int runopt = RUN_GIT_CMD; > + > +static const char * const builtin_foreachrepo_usage[] = { > + N_("git for-each-repo [-acdxz] [cmd]"), > + NULL > +}; > + > +static struct option builtin_foreachrepo_options[] = { > + OPT_SET_INT('a', "all", &match, N_("match both clean and dirty repositories"), ALL), > + OPT_SET_INT('c', "clean", &match, N_("only show clean repositories"), CLEAN), > + OPT_SET_INT('d', "dirty", &match, N_("only show dirty repositories"), DIRTY), > + OPT_SET_INT('x', NULL, &runopt, N_("execute generic (non-git) command"), 0), > + OPT_SET_INT('z', NULL, &eol, N_("terminate each repo path with NUL character"), 0), > + OPT_END(), > +}; > + > +static int get_repo_state(const char *dir) > +{ > + const char *diffidx[] = {"diff", "--quiet", "--cached", NULL}; > + const char *diffwd[] = {"diff", "--quiet", NULL}; > + > + if (run_command_v_opt_cd_env(diffidx, RUN_GIT_CMD, dir, NULL) != 0) > + return DIRTY; > + if (run_command_v_opt_cd_env(diffwd, RUN_GIT_CMD, dir, NULL) != 0) > + return DIRTY; > + return CLEAN; > +} > + > +static void print_repo_path(const char *path, unsigned pretty) > +{ > + if (path[0] == '.' && path[1] == '/') > + path += 2; > + if (pretty) > + color_fprintf_ln(stdout, color, "[%s]", path); This is shown before running a command in that repository. I am of two minds. It certainly is nice to be able to tell which repository each block of output lines comes from, and not requiring the command to do this themselves is a good default. However, I wonder if people would want to do something like this: git for-each-repo sh -c ' git diff --name-only | sed -e "s|^|$path/|" ' to get a consolidated view, in a way similar to how "submodule foreach" can be used. This unconditional output will get in the way for such a use case. Oh, that reminds me of another thing. Perhaps we would want to export the (relative) path to the found repository in some way to allow the commands to do this kind of thing in the first place? "submodule foreach" does this with $path, I think. > + else > + write_name_quoted(path, stdout, eol); > +} Nice. Doubly nice that you do not hardcode "color" at this point but made it into a separate variable. > +static void handle_repo(struct strbuf *path, const char **argv) > +{ > + const char *gitdir; > + int len; > + > + len = path->len; > + strbuf_addstr(path, ".git"); > + gitdir = resolve_gitdir(path->buf); > + strbuf_setlen(path, len - 1); > + if (!gitdir) > + goto done; > + if (match != ALL && match != get_repo_state(path->buf)) > + goto done; > + print_repo_path(path->buf, *argv != NULL); > + if (*argv) > + run_command_v_opt_cd_env(argv, runopt, path->buf, NULL); > +done: > + strbuf_addstr(path, "/"); OK, you get "$D/" from the caller, make it "$D/.git" to call resolve_gitdir() with, turn it to "$D" before printing and runnning, and then add "/" back. Slightly tricky but correct. > +static int walk(struct strbuf *path, int argc, const char **argv) > +{ > + DIR *dir; > + struct dirent *ent; > + struct stat st; > + size_t len; > + int has_dotgit = 0; > + struct string_list list = STRING_LIST_INIT_DUP; > + struct string_list_item *item; > + > + dir = opendir(path->buf); > + if (!dir) > + return errno; > + strbuf_addstr(path, "/"); > + len = path->len; > + while ((ent = readdir(dir))) { > + if (!strcmp(ent->d_name, ".") || !strcmp(ent->d_name, "..")) > + continue; > + if (!strcmp(ent->d_name, ".git")) { > + has_dotgit = 1; > + continue; > + } > + switch (DTYPE(ent)) { > + case DT_UNKNOWN: > + case DT_LNK: > + /* Use stat() to figure out if this path leads > + * to a directory - it's not important if it's > + * a symlink which gets us there. > + */ > + strbuf_setlen(path, len); > + strbuf_addstr(path, ent->d_name); > + if (stat(path->buf, &st) || !S_ISDIR(st.st_mode)) > + break; > + /* fallthrough */ > + case DT_DIR: > + string_list_append(&list, ent->d_name); > + break; > + } > + } > + closedir(dir); > + strbuf_setlen(path, len); > + if (has_dotgit) > + handle_repo(path, argv); > + sort_string_list(&list); > + for_each_string_list_item(item, &list) { > + strbuf_setlen(path, len); > + strbuf_addstr(path, item->string); > + walk(path, argc, argv); > + } > + string_list_clear(&list, 0); > + return 0; > +} Is the "collect-first-and-then-sort" done so that the repositories are shown in a stable order regardless of the order in which readdir() returns he entries? I am not complaining, but being curious. > diff --git a/t/t6400-for-each-repo.sh b/t/t6400-for-each-repo.sh This command does not look like "6 - the revision tree commands" to me. "7 - the porcelainish commands concerning the working tree" or "9 - the git tools" may be a better match? > new file mode 100755 > index 0000000..af02c0c > --- /dev/null > +++ b/t/t6400-for-each-repo.sh > @@ -0,0 +1,150 @@ > +#!/bin/sh > +# > +# Copyright (c) 2013 Lars Hjemli > +# > + > +test_description='Test the git-for-each-repo command' > + > +. ./test-lib.sh > + > +qname="with\"quote" > +qqname="\"with\\\"quote\"" If Windows does not have problems with paths with dq in it, then this is fine, but I dunno. Otherwise, you may want to exclude the c-quote testing from the main part of the test, and have a single test that has prerequisite for filesystems that can do this at the end of the script. > +test_expect_success "setup" ' > + test_create_repo clean && > + (cd clean && test_commit foo1) && > + git init --separate-git-dir=.cleansub clean/gitfile && > + (cd clean/gitfile && test_commit foo2 && echo bar >>foo2.t) && > + test_create_repo dirty-idx && > + (cd dirty-idx && test_commit foo3 && git rm foo3.t) && > + test_create_repo dirty-wt && > + (cd dirty-wt && mv .git .linkedgit && ln -s .linkedgit .git && Some platforms are symlink-challenged. Can we do this test without "ln -s"? SYMLINKS prereq wouldn't be very useful for the setup step, as all the remaining tests won't work without setting up the test scenario. > + test_commit foo4 && rm foo4.t) && > + test_create_repo "$qname" && > + (cd "$qname" && test_commit foo5) && > + mkdir fakedir && mkdir fakedir/.git > +' > + > +test_expect_success "without filtering, all repos are included" ' > + echo "." >expect && > + echo "clean" >>expect && > + echo "clean/gitfile" >>expect && > + echo "dirty-idx" >>expect && > + echo "dirty-wt" >>expect && > + echo "$qqname" >>expect && A single cat >expect <<-EOF . clean clean/gitfile ... $qqname EOF may be a lot easier to read (likewise for all the "expect" preparation in the rest of the script). > +test_expect_success "-z NUL-terminates each path" ' > + echo "(.)" >expect && > + echo "(clean)" >>expect && > + echo "(clean/gitfile)" >>expect && > + echo "(dirty-idx)" >>expect && > + echo "(dirty-wt)" >>expect && > + echo "($qname)" >>expect && > + git for-each-repo -z | xargs -0 printf "(%s)\n" >actual && This needs prereq on "xargs -0", but because we know we do not have any string with Q in it in the expected list of repositories, it may be simpler to do something like this: echo ".QcleanQclean/gitfileQ...$qname" >expect && git for-each-repo -z | tr "\0" Q >actual && test_cmp expect actual Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html