Thanks for working on this. I apologize for not reviewing earlier rounds (other than v2 [1]); it's been difficult to find a block of time to do so... On Sun, Aug 30, 2015 at 3:10 PM, Michael Rappazzo <rappazzo@xxxxxxxxx> wrote: > for_each_worktree iterates through each worktree and invokes a callback > function. The main worktree (if not bare) is always encountered first, > followed by worktrees created by `git worktree add`. Why does this iteration function specially filter out a bare main worktree? If it instead unconditionally vends the main worktree (bare or not), then the caller can make its own decision about what to do with it, thus empowering the caller, rather than imposing a (possibly) arbitrary restriction upon it. For instance, the "git worktree list" command may very well want to show the main worktree, even if bare (possibly controlled by a command-line option), annotated appropriately ("[bare]"). This may be exactly the sort of information a user wants to know, and by leaving the decision up to the caller, then the caller ("git worktree list" in this example) has the opportunity to act accordingly, whereas if for_each_worktree() filters out a bare main worktree unconditionally, then the caller ("git worktree list") will never be able to offer such an option. > If the callback function returns a non-zero value, iteration stops, and > the callback's return value is returned from the for_each_worktree > function. Stepping back a bit, is a for-each-foo()-style interface desirable? This sort of interface imposes a good deal of complexity on callers, demanding a callback function and callback data (cb_data), and is generally (at least in C) more difficult to reason about than other simpler interfaces. Is such complexity warranted? An alternate, much simpler interface would be to have a function, say get_worktrees(), return an array of 'worktree' structures to the caller, which the caller would iterate over (which is a common operation in C, thus easily reasoned about). The one benefit of a for-each-foo()-style interface is that it's possible to "exit early", thus avoiding the cost of interrogating meta-data for worktrees in which the caller is not interested, however, it seems unlikely that there will be so many worktrees linked to a repository for this early exit to translate into any real savings. > Signed-off-by: Michael Rappazzo <rappazzo@xxxxxxxxx> > --- > diff --git a/builtin/worktree.c b/builtin/worktree.c > index 430b51e..7b3cb96 100644 > --- a/builtin/worktree.c > +++ b/builtin/worktree.c > @@ -26,6 +26,14 @@ static int show_only; > static int verbose; > static unsigned long expire; > > +/* > + * The signature for the callback function for the for_each_worktree() > + * function below. The memory pointed to by the callback arguments > + * is only guaranteed to be valid for the duration of a single > + * callback invocation. > + */ > +typedef int each_worktree_fn(const char *path, const char *git_dir, void *cb_data); In my review[1] of v2, I mentioned that (at some point) the "git worktree list" command might like to show additional information about the worktree other than just its location. Such information might include its tag, the checked out branch (or detached HEAD), whether it's locked (and the lock reason and whether the worktree is currently "online"), whether it can be pruned (and the prune reason). Commands such as "git worktree add" and "git checkout" need to know if a branch is already checked out in some (other) worktree, thus they also need the "branch" information. This need by clients for additional worktree meta-data suggests that the additional information ought to be encapsulated into a 'struct worktree', and that for_each_worktree() should vend a 'struct worktree' rather than vending merely the "git dir". (Alternately, the above-suggested get_worktrees() should return an array of 'struct worktree' items.) An important reason for making for_each_worktree() vend a 'struct worktree' is that it facilitates centralizing all the logic for determining and computing the extra worktree meta-data in one place, thus relieving callers of burden of having to compute the extra information themselves. (Junio alluded to this in his v5 review[2].) > static int prune_worktree(const char *id, struct strbuf *reason) > { > struct stat st; > @@ -359,6 +367,81 @@ static int add(int ac, const char **av, const char *prefix) > return add_worktree(path, branch, &opts); > } > > +/* > + * Iterate through each worktree and invoke the callback function. If > + * the callback function returns non-zero, the iteration stops, and > + * this function returns that return value > + */ > +static int for_each_worktree(each_worktree_fn fn, void *cb_data) Ultimately, code outside of builtin/worktree.c will want to benefit from the encapsulation of this worktree iteration logic. For instance, the support code in (top-level) branch.c invoked by "git worktree add" and "git checkout" to determine if a branch is already checked out in some (other) branch could avail itself of this iteration interface. As such, it would make sense to place this code at location where it can be accessed by callers other than just builtin/worktree.c. A good location for it to reside might be in a new top-level file worktree.c. It doesn't have to be part of the current patch series, but, eventually, the code in branch.c for determining if a branch is already checked out elsewhere also should migrate to the new top-level worktree.c; in particular, die_if_checked_out(), find_shared_symref(), and find_linked_symref(). > +{ > + struct strbuf worktree_path = STRBUF_INIT; > + struct strbuf worktree_git = STRBUF_INIT; > + const char *common_dir; > + int main_is_bare = 0; > + int ret = 0; > + > + common_dir = get_git_common_dir(); > + if (!strcmp(common_dir, get_git_dir())) { > + /* simple case - this is the main repo */ > + main_is_bare = is_bare_repository(); > + if (!main_is_bare) { > + strbuf_addstr(&worktree_path, get_git_work_tree()); > + } else { > + strbuf_addstr(&worktree_path, common_dir); > + } > + } else { > + strbuf_addstr(&worktree_path, common_dir); > + /* If the common_dir DOES NOT end with '/.git', then it is bare */ > + main_is_bare = !strbuf_strip_suffix(&worktree_path, "/.git"); > + } > + strbuf_addstr(&worktree_git, worktree_path.buf); I may have missed some discussion in earlier rounds (or perhaps I'm too simple-minded), but I'm confused about why this logic (and most of the rest of the function) differs so much from existing logic in branch.c:find_shared_symref() and find_linked_symref() for iterating over the worktrees and gleaning information about them. That logic in branch.c seems to do a pretty good job of reporting the worktree in which a branch is already checked out, so it's not clear why the above logic takes a different (and seemingly more complex) approach. > + if (!main_is_bare) { > + strbuf_addstr(&worktree_git, "/.git"); > + > + ret = fn(worktree_path.buf, worktree_git.buf, cb_data); > + if (ret) > + goto done; > + } > + strbuf_addstr(&worktree_git, "/worktrees"); > + > + if (is_directory(worktree_git.buf)) { As mentioned in my v2 review[1], this is_directory() invocation doesn't buy you anything. The following opendir() will either succeed or fail anyhow, so checking beforehand if 'worktree_git' is a directory is wasted work. If you eliminate is_directory(), you avoid that unnecessary work (and the rest of the code can be less deeply indented). > + DIR *dir = opendir(worktree_git.buf); > + if (dir) { > + struct stat st; > + struct dirent *d; > + size_t base_path_len = worktree_git.len; > + > + while ((d = readdir(dir)) != NULL) { > + if (!strcmp(d->d_name, ".") || !strcmp(d->d_name, "..")) > + continue; > + > + strbuf_setlen(&worktree_git, base_path_len); > + strbuf_addf(&worktree_git, "/%s/gitdir", d->d_name); > + > + if (stat(worktree_git.buf, &st)) { > + fprintf(stderr, "Unable to read worktree git dir: %s\n", worktree_git.buf); > + continue; > + } > + > + strbuf_reset(&worktree_path); > + strbuf_read_file(&worktree_path, worktree_git.buf, st.st_size); > + strbuf_strip_suffix(&worktree_path, "/.git\n"); > + > + strbuf_strip_suffix(&worktree_git, "/gitdir"); > + ret = fn(worktree_path.buf, worktree_git.buf, cb_data); > + if (ret) > + break; > + } > + } > + closedir(dir); This closedir() should be inside the `if (dir) {...}' block rather than outside. Returning to the issue of this code differing from existing code in branch.c for iterating worktrees, I'm curious as to why this functionality was implemented again from scratch here rather than re-using the existing (somewhat) well proven code and logic in branch.c:find_shared_symref() and find_linked_symref(). My v2 review[1] hinted at re-use a couple times. Considering that the existing code in branch.c for iterating worktrees has gone through several code reviews and is already (somewhat) proven, re-using that code for a general purpose worktree iterator makes lots of sense compared with writing it again from scratch, as is done here. Consequently, I, personally, would be happier to see the existing code refactored (perhaps over several patches) to the point that it can be re-used as a general purpose iterator. But, that's my opinion; I can't speak for others. > + } > +done: > + strbuf_release(&worktree_git); > + strbuf_release(&worktree_path); > + return ret; > +} [1]: http://article.gmane.org/gmane.comp.version-control.git/275528 [2]: http://article.gmane.org/gmane.comp.version-control.git/276471 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html