Re: [RFC PATCH 3/3] grep: add support for grepping in submodules

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 29/09/10 15:21, Jens Lehmann wrote:
> Nice start!
> 
> 
> Am 29.09.2010 22:28, schrieb Chris Packham:
>> When the --recurse-submodules option is given git grep will search in
>> submodules as they are encountered.
> 
> As "git clone" already introduced a "--recursive" option for
> submodule recursion IMO "--recursive" should be used here too for
> consistency. (Maybe you took the idea to use "--recurse-submodules"
> from my "git-checkout-recurse-submodules" branch on github? But that
> is only used there because I didn't get around to change it yet like
> I did in the "fetch-submodules-too" branch).

I actually started with --recursive and switched to
--recurse-submodules. One thing with this is the standard grep
--recursive option which may cause some confusion if people expect git
grep to behave like normal grep. I'll switch to using --recursive for
now until someone objects to the potential confusion.

One more thought on this that has been hanging around in my mind. I
sometimes want to do something on all but one submodule, in this case
with grep I'm fairly likely to want to skip a linux repository because I
already know the thing I'm looking for is in userland. Maybe in the
future we can make --recursive take an argument that allows us to
specify/restrict which submodules get included in the command invocation.

> 
>> Signed-off-by: Chris Packham <judge.packham@xxxxxxxxx>
>> ---
>>  builtin/grep.c |   72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  grep.h         |    1 +
>>  2 files changed, 73 insertions(+), 0 deletions(-)
>>
>> diff --git a/builtin/grep.c b/builtin/grep.c
>> index 8315ff0..c9befdc 100644
>> --- a/builtin/grep.c
>> +++ b/builtin/grep.c
>> @@ -30,6 +30,9 @@ static char const * const grep_usage[] = {
>>  
>>  static int use_threads = 1;
>>  
>> +static int saved_argc;
>> +static const char **saved_argv;
>> +
>>  #ifndef NO_PTHREADS
>>  #define THREADS 8
>>  static pthread_t threads[THREADS];
>> @@ -585,6 +588,67 @@ static void run_pager(struct grep_opt *opt, const char *prefix)
>>  	free(argv);
>>  }
>>  
>> +static const char **create_sub_grep_argv(struct grep_opt *opt, const char *path)
>> +{
>> +	#define NUM_ARGS 10
>> +	struct strbuf buf = STRBUF_INIT;
>> +	const char **argv;
>> +	int i = 0;
>> +
>> +	argv = xcalloc(NUM_ARGS, sizeof(const char *));
>> +	argv[i++] = "grep";
>> +	strbuf_addf(&buf, "--submodule-prefix=\\\"%s\\\"", path);
>> +	//argv[i++] = buf.buf;
> 
> As C++ comments are not portable they have to be avoided, but I
> assume this one here (and the unused "saved_argc" variable too) is
> a hint that this code is not working as intended yet? ;-)

Yeah this is due to my test problem I mentioned in the cover email. When
run_command gets called it ends up invoking the installed git executable
which doesn't understand my new option.

> It seems you want to use strbuf_detach() here so that this argv[]
> stays valid after the strbuf_release() at the end of this function.
> And if I'm not missing something this would not work correctly in
> the second recursion depth, as the new submodule prefix should
> be the one given to this grep command concatenated with the current
> submodule path.

I'll look into strbuf_detatch. The tricky thing will be keeping track of
what to free at the end of grep_submodule. I guess I can assume that it
will argv[1] is always going to be the dynamically allocated string.

> 
>> +	if (opt->linenum)
>> +		argv[i++] = "-n";
>> +	if (opt->invert)
>> +		argv[i++] = "-v";
>> +	if (opt->ignore_case)
>> +		argv[i++] = "-i";
>> +	if (opt->count)
>> +		argv[i++] = "-c";
>> +	if (opt->name_only)
>> +		argv[i++] = "-l";
>> +
>> +	argv[i++] = saved_argv[0];
>> +	argv[i++] = NULL;
> 
> Hm, at a quick glance it might be much easier to copy argc & argv
> in cmd_grep() before parse_options() starts manipulating it. Then
> you would only have to change/add the "--submodule-prefix" option
> as needed and would not have to deal with all possible grep option
> combinations (and for example you don't pass the recurse option
> yet, which would stop the recursion pretty soon).

Yeah this is the part I was struggling with a little. It would be easy
to save argv before any option processing but I wondered if that would
be frowned upon as an overhead for non-submodule usages.

I was thinking about doing something tricky with max-depth and
recursion. But maybe its better to keep it simple.

>> +
>> +	strbuf_release(&buf);
>> +	return argv;
>> +}
>> +
>> +static int grep_submodule(struct grep_opt *opt, const char *path)
>> +{
>> +	struct strbuf buf = STRBUF_INIT;
>> +	struct child_process cp;	
>> +	const char **argv = create_sub_grep_argv(opt, path);
>> +	const char *git_dir;
>> +	int hit = 0;
>> +
>> +	strbuf_addf(&buf, "%s/.git", path);
>> +	git_dir = read_gitfile_gently(buf.buf);
>> +	if (!git_dir)
>> +		git_dir = buf.buf;
>> +	if (!is_directory(git_dir)) {
>> +		warning("submodule %s has not been initialized\n", path);
> 
> Having a submodule which is not checked out is perfectly fine, so
> I don't think we want to issue a warning here.
> 
OK I'll remove it.

>> +		goto out_free;
>> +	}
>> +
>> +	memset(&cp, 0, sizeof(cp));
>> +	cp.argv = argv;
>> +	cp.env = local_repo_env;
>> +	cp.git_cmd = 1;
>> +	cp.no_stdin = 1;
>> +	cp.dir = path;
>> +	if (run_command(&cp) == 0)
>> +		hit = 1;
>> +out_free:
>> +	free(argv);
>> +	strbuf_release(&buf);
>> +	return hit;
>> +}
>> +
>>  static int grep_cache(struct grep_opt *opt, const char **paths, int cached)
>>  {
>>  	int hit = 0;
>> @@ -593,6 +657,10 @@ static int grep_cache(struct grep_opt *opt, const char **paths, int cached)
>>  
>>  	for (nr = 0; nr < active_nr; nr++) {
>>  		struct cache_entry *ce = active_cache[nr];
>> +		if (S_ISGITLINK(ce->ce_mode) && opt->recurse_submodules) {
>> +			hit |= grep_submodule(opt, ce->name);
>> +			continue;
>> +		}
>>  		if (!S_ISREG(ce->ce_mode))
>>  			continue;
>>  		if (!pathspec_matches(paths, ce->name, opt->max_depth))
>> @@ -929,9 +997,13 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>>  		  PARSE_OPT_HIDDEN | PARSE_OPT_NOARG, help_callback },
>>  		OPT_STRING(0, "submodule-prefix", &opt.submodule_prefix, "DIR",
>>  			"prepend this to submodule path output"),
>> +		OPT_BOOLEAN(0, "recurse-submodules", &opt.recurse_submodules,
>> +			"recurse into submodules"),
>>  		OPT_END()
>>  	};
>>  
>> +	saved_argc = argc;
>> +	saved_argv = argv;
>>  	/*
>>  	 * 'git grep -h', unlike 'git grep -h <pattern>', is a request
>>  	 * to show usage information and exit.
>> diff --git a/grep.h b/grep.h
>> index d918da4..e3199c9 100644
>> --- a/grep.h
>> +++ b/grep.h
>> @@ -102,6 +102,7 @@ struct grep_opt {
>>  	unsigned post_context;
>>  	unsigned last_shown;
>>  	int show_hunk_mark;
>> +	int recurse_submodules;
>>  	void *priv;
>>  
>>  	void (*output)(struct grep_opt *opt, const void *data, size_t size);
> 

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]