Brandon Williams <bmwill@xxxxxxxxxx> writes:

> Allow ls-files to recognize submodules in order to retrieve a list of
> files from a repository's submodules. This is done by forking off a
> process to recursively call ls-files on all submodules. Also added an
> output-path-prefix command in order to prepend paths to child processes.
>
> Signed-off-by: Brandon Williams <bmwill@xxxxxxxxxx>

> @@ -68,6 +71,21 @@ static void write_eolinfo(const struct cache_entry *ce, const char *path)
>  static void write_name(const char *name)
>  {
>  	/*
> +	 * NEEDSWORK: To make this thread-safe, full_name would have to be owned
> +	 * by the caller.
> +	 *
> +	 * full_name get reused across output lines to minimize the allocation
> +	 * churn.
> +	 */
> +	static struct strbuf full_name = STRBUF_INIT;
> +	if (output_path_prefix != '\0') {
> +		strbuf_reset(&full_name);
> +		strbuf_addstr(&full_name, output_path_prefix);
> +		strbuf_addstr(&full_name, name);
> +		name = full_name.buf;
> +	}

At first glance it was surprising that no test caught this lack of
dereference; the reason is that you initialize output_path_prefix to
an empty string, not NULL, so full_name.buf is always used, which
has no impact on the output.

I think initializing it to NULL is a more typical way to say "this
option has not been given", and if you took that route, the
condition would become

	if (output_path_prefix && *output_path_prefix) {
		...

In any case, the fact that only this much change was required to add
output-path-prefix shows two good things: (1) the original code was
already well structured, funneling any pathname we need to emit
through this single function so that we can do this kind of update,
and (2) the author of the patch was competent enough to spot this
single point that needs to be updated.

Nice.

> +	status = run_command(&cp);
> +	if (status)
> +		exit(status);

run_command()'s return value comes from either start_command() or
finish_command().
These signal failure by returning a non-zero value, and in
practice they are small negative integers.  Feeding a negative value
to exit() is not quite kosher.  Perhaps exit(128), to mimic what
happens when we call die(), is better.

If your primary interest is to support the "find in the working tree
files that are tracked, recursively in submodules" grep, I think
this "when we hit a submodule, spawn a separate ls-files in there"
approach is sufficient and a solid base to build on.

On the other hand, if you are more ambitious and "grep" is merely an
example of things that can be helped by having a list of paths
across module boundaries, we may want to "libify" ls-files in such a
way that a single process can instantiate one or more instances of
the "ls-files machinery".  Each instance would take the repository
to work in and other arguments that specify which paths to report
and, instead of always showing the result on the standard output,
would make repeated calls to a callback function to report each
discovered path and the other attributes associated with it that
were asked for (the object name, values of tag_*, etc.), without
spawning a separate "ls-files" process.

The latter would be a much bigger task and I do not necessarily think
it is needed, but it is one possible future direction to keep in
mind.

Thanks, will queue with a minimum fix.
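
For reference, here is a minimal standalone sketch of the prefix
check discussed above.  It is not the code from the patch: the helper
names have_prefix() and prefixed_name() are made up for illustration,
and a caller-supplied buffer stands in for the strbuf.  The original
condition compiled only because '\0' is an integer constant zero, so
comparing the pointer against it amounts to a NULL check rather than
an empty-string check.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): true only when a prefix
 * was given (non-NULL) and is not the empty string.
 */
int have_prefix(const char *prefix)
{
	return prefix && *prefix;
}

/*
 * Hypothetical helper (not in the patch): return the name to emit.
 * Without a usable prefix the original name is returned untouched;
 * otherwise prefix and name are joined in the caller-owned buffer
 * "out", truncating if it does not fit in "outlen" bytes.
 */
const char *prefixed_name(const char *prefix, const char *name,
			  char *out, size_t outlen)
{
	if (!have_prefix(prefix))
		return name;
	snprintf(out, outlen, "%s%s", prefix, name);
	return out;
}
```

With this shape, both NULL and "" leave the name alone, and only a
real prefix such as "sub/" triggers the copy.  Having the caller own
the buffer is one way to address the thread-safety NEEDSWORK, at the
cost of the allocation reuse the static strbuf provides.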