Re: How to use path limiting (using a glob)?

On Wed, 11 Feb 2009, Peter Baumann wrote:

> after reading Junio's nice blog today where he explained how to use git grep
> efficiently, I saw him using a glob to match for the interesting files:
> 
> 	 $ git grep -e ';;' -- '*.c'
> 
> Is it possible to have the same feature in git diff and the revision
> machinery?

Not really. Git has two different kinds of path limiters, and they are 
really really different.

 - the "walk current index/directory recursively" kind that "git ls-files" 
   uses, which takes an 'fnmatch()'-style path pattern (not a real regexp, 
   but the kind of glob you're used to from the shell)

   NOTE! On purpose, we don't set FNM_PATHNAME, so "*.c" here is 
   different from *.c in shell (it's more like "**.c" in tcsh). IOW, * 
   matches '/' too, and will walk subdirectories.

 - the "revision limiter" pathspec. This is *not* a regexp, it's a pure 
   prefix matcher, for a very simple reason: performance.
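
To make the difference concrete, here is a rough Python sketch of the two 
matching styles. It is only an illustration, not git's code: the helper 
names and the path list are made up, and Python's fnmatchcase() stands in 
for fnmatch() without FNM_PATHNAME because its "*" also matches '/'.

```python
from fnmatch import fnmatchcase

def glob_match(pattern, path):
    # fnmatchcase() lets "*" match any characters, including "/",
    # which mimics fnmatch() called without FNM_PATHNAME.
    return fnmatchcase(path, pattern)

def prefix_match(prefix, path):
    # Pure prefix test on whole path components: no wildcards at all.
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")

# Hypothetical set of tracked paths, for illustration only.
paths = ["main.c", "lib/util.c", "lib/util.h", "lib/deep/io.c", "library.c"]

glob_hits = [p for p in paths if glob_match("*.c", p)]
prefix_hits = [p for p in paths if prefix_match("lib", p)]
literal_hits = [p for p in paths if prefix_match("*.c", p)]

print(glob_hits)     # ['main.c', 'lib/util.c', 'lib/deep/io.c', 'library.c']
print(prefix_hits)   # ['lib/util.c', 'lib/util.h', 'lib/deep/io.c']
print(literal_hits)  # []
```

The last line is the interesting one: a prefix matcher handed '*.c' just 
looks for a path literally starting with "*.c", and finds nothing.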

> 	$ cd $path_to_your_git_src_dir
> 	$ git log master -p -- '*.h'
> 	.... No commit shown 
> 
> 	$ git diff --name-only v1.5.0  v1.6.0 -- '*.c'
> 
> and both don't return anything.

Yeah, in the revision matcher you can still depend on the shell 
expansion, and it will do _almost_ the right thing. So if you do

	git log master -p *.c

without the quotes, the shell expansion will work, and that in turn will 
give a set of filenames that "git log" will restrict the log to. HOWEVER, 
it's not a real wildcard - it's literally looking at what you have now in 
your current working directory, and saying "give me the logs of those 
pathnames", not "give me the logs of everything ending with .c".
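
A small sketch of what the shell hands over in that case, using Python's 
glob module as a stand-in for shell expansion (an assumption: real shells 
differ in details). The directory layout and file names are hypothetical.

```python
import glob
import os
import tempfile

def expand_like_shell(pattern, directory):
    # Expand the pattern against what exists in `directory` right now.
    # As in the shell, "*" here does not cross "/".
    full = os.path.join(glob.escape(directory), pattern)
    return sorted(os.path.relpath(p, directory) for p in glob.glob(full))

with tempfile.TemporaryDirectory() as work:
    os.mkdir(os.path.join(work, "lib"))
    for name in ("main.c", "main.h", os.path.join("lib", "util.c")):
        open(os.path.join(work, name), "w").close()
    expanded = expand_like_shell("*.c", work)

# Only names present in the top-level directory at expansion time survive.
# lib/util.c is missing, and so would be any .c file that existed only in
# older commits.
print(expanded)  # ['main.c']
```

So the command only ever sees a fixed list of pathnames, never the 
pattern itself.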

We _could_ make the revision limiter understand fnmatch-style patterns, 
but quite frankly, it's very very expensive - too expensive to be useful 
for big repositories. The point about only matching prefixes is that it 
allows the revision limiter to not even walk into subdirectories that 
don't match, but if you do the "*.c" kind of pattern, now the revision 
code has to look up every tree recursively. That code is also _extremely_ 
performance-critical, so we really don't want to use fnmatch() when we can 
currently use just "memcmp()".
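
The cost argument can be sketched with a toy tree walker that counts how 
many trees it has to open. This is a simplified model, not the actual 
revision code: the nested-dict tree, the function names and the counting 
are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Toy tree: a dict is a subdirectory, None marks a file.
tree = {
    "Documentation": {"howto": {"setup.txt": None}, "git.txt": None},
    "lib": {"deep": {"io.c": None}, "util.c": None},
    "t": {"lib": {"helper.c": None}},
    "main.c": None,
}

def walk_prefix(tree, prefix, base=""):
    """Return (matching files, trees opened) for a prefix limiter."""
    hits, opened = [], 1
    for name, child in sorted(tree.items()):
        path = base + name
        if isinstance(child, dict):
            # Descend only if this directory leads to, or lies inside, the prefix.
            if prefix.startswith(path + "/") or (path + "/").startswith(prefix + "/"):
                sub_hits, sub_opened = walk_prefix(child, prefix, path + "/")
                hits += sub_hits
                opened += sub_opened
        elif path == prefix or path.startswith(prefix + "/"):
            hits.append(path)
    return hits, opened

def walk_glob(tree, pattern, base=""):
    """Return (matching files, trees opened) for a '*'-crosses-'/' pattern."""
    hits, opened = [], 1
    for name, child in sorted(tree.items()):
        path = base + name
        if isinstance(child, dict):
            # "*" may match "/", so no directory can be ruled out up front.
            sub_hits, sub_opened = walk_glob(child, pattern, path + "/")
            hits += sub_hits
            opened += sub_opened
        elif fnmatchcase(path, pattern):
            hits.append(path)
    return hits, opened

prefix_result = walk_prefix(tree, "lib")
glob_result = walk_glob(tree, "*.c")

print(prefix_result)  # (['lib/deep/io.c', 'lib/util.c'], 3)
print(glob_result)    # (['lib/deep/io.c', 'lib/util.c', 'main.c', 't/lib/helper.c'], 7)
```

In this toy case the prefix walk opens 3 of the 7 trees, while the glob 
walk has to open all 7. Multiply that by every commit being examined and 
the difference adds up.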

So yes, it's kind of odd how we have two totally different concepts of 
pathname patterns, but it's probably easiest to remember that "'git grep' 
is just special". 

		Linus