Re: How to use path limiting (using a glob)?

On Wed, Feb 11, 2009 at 11:40:44AM -0800, Linus Torvalds wrote:
> 
> 
> On Wed, 11 Feb 2009, Peter Baumann wrote:
> 
> > after reading Junio's nice blog today where he explained how to use git grep
> > efficiently, I saw him using a glob to match for the interesting files:
> > 
> > 	 $ git grep -e ';;' -- '*.c'
> > 
> > Is it possible to have the same feature in git diff and the revision
> > machinery?
> 
> Not really. Git has two different kinds of path limiters, and they are 
> really really different.
> 
>  - the "walk current index/directory recursively" kind that "git ls-files" 
>    uses, which takes a 'fnmatch()' type path regexp (not a real regexp, 
>    but the kind you're used to with shell)
> 
>    NOTE! On purpose, we don't set the FNM_PATHNAME, so "*.c" here is 
>    different from *.c in shell (it's more like "**.c" in tcsh). IOW, * 
>    matches '/' too, and will walk subdirectories.
> 

Hm. But if git anchored the * at the current directory only, wouldn't
that solve (or at least reduce) the performance problems you describe in the
later paragraph? An explicit "**.c" could then be used for the cases where
a recursive search for every .c file is really wanted.
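
To make the FNM_PATHNAME difference concrete, here is a small sketch in
Python rather than C. Python's fnmatch module, like fnmatch() without
FNM_PATHNAME, lets '*' match '/'. The helper match_pathname() is a
hypothetical approximation of the FNM_PATHNAME behaviour that I wrote for
illustration; it only handles '*', '?' and literal characters and is not
what git does internally.

```python
import fnmatch
import re


def match_no_pathname(pattern, path):
    # fnmatchcase() gives '/' no special meaning, so '*' can cross
    # directory boundaries (similar to fnmatch() without FNM_PATHNAME).
    return fnmatch.fnmatchcase(path, pattern)


def match_pathname(pattern, path):
    # Hypothetical helper: approximates FNM_PATHNAME by making '*' and '?'
    # stop at '/'. Bracket expressions are not supported in this sketch.
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")
        elif ch == "?":
            parts.append("[^/]")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), path) is not None


for path in ("grep.c", "builtin/grep.c"):
    print(path, match_no_pathname("*.c", path), match_pathname("*.c", path))
```

With the first function "*.c" matches files in subdirectories too; with the
second it only matches files directly in the directory the pattern is
anchored at.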

>  - the "revision limiter" pathspec. This is *not* a regexp, it's a pure 
>    prefix matcher, for a very simple reason: performance.
> 
> > 	$ cd $path_to_your_git_src_dir
> > 	$ git log master -p -- '*.h'
> > 	.... No commit shown 
> > 
> > 	$ git diff --name-only v1.5.0  v1.6.0 -- '*.c'
> > 
> > and both don't return anything.
> 
> Yeah, in the revision matcher you can still depend on the shell 
> expansion, and it will do _almost_ the right thing. So if you do
> 
> 	git log master -p *.c
> 
> without the quotes, the shell expansion will work, and that in turn will 
> give a set of filenames that "git log" will restrict the log to. HOWEVER, 
> it's not a real wildcard - it's literally looking at what you have now in 
> your current working directory, and saying "give me the logs of those 
> pathnames", not "give me the logs of everything ending with .c".
> 

OK. That's actually why I asked: a file that has since been removed
won't be found by the shell expansion.
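
A toy model of the problem, as a sketch only: the history, the commit
names and the two log_for_*() helpers below are all made up for
illustration and have nothing to do with git's real data structures.
Expanding the pattern against the working tree yields only the files that
exist now, so commits that touched only a since-deleted file are missed.

```python
import fnmatch

# Hypothetical history, newest first: (commit id, paths touched).
history = [
    ("c3", ["main.c"]),
    ("c2", ["old.c", "main.c"]),
    ("c1", ["old.c"]),          # old.c was deleted later on
]

# Hypothetical working tree: old.c no longer exists.
working_tree = ["main.c", "README"]


def log_for_paths(paths):
    # Mimics "log limited to a fixed set of path names".
    wanted = set(paths)
    return [cid for cid, touched in history if wanted & set(touched)]


def log_for_pattern(pattern):
    # Mimics a real wildcard applied to every path in the history.
    return [
        cid
        for cid, touched in history
        if any(fnmatch.fnmatchcase(p, pattern) for p in touched)
    ]


# What the shell would hand over after expanding *.c in the working tree.
expanded = [p for p in working_tree if fnmatch.fnmatchcase(p, "*.c")]

print(log_for_paths(expanded))    # commits found via shell expansion
print(log_for_pattern("*.c"))     # commits found via a true wildcard
```

The first call never sees c1, because old.c is not in the expanded list.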

> We _could_ make the revision limiter understand fnmatch-style patterns, 
> but quite frankly, it's very very expensive - too expensive to be useful 
> for big repositories. The point about only matching prefixes is that it 
> allows the revision limiter to not even walk into subdirectories that 
> don't match, but if you do the "*.c" kind of pattern, now the revision 
> code has to look up every tree recursively. That code is also _extremely_ 
> performance-critical, so we really don't want to use fnmatch() when we can 
> currently use just "memcmp()".
> 
> So yes, it's kind of odd how we have two totally different concepts of 
> pathname patterns, but it's probably easiest to remember that "'git grep' 
> is just special". 
> 
> 		Linus
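
To convince myself of the cost argument I wrote a small sketch. It is only
a model: the nested-dict tree and the functions walk_prefix() and
walk_glob() are hypothetical, and git's real code compares tree entries in
C, not like this. The point is just that a prefix lets the walker skip
whole subdirectories, while a "*.c" style pattern forces it to open every
tree.

```python
import fnmatch

# Hypothetical tree: dicts are directories, None marks a file.
tree = {
    "Documentation": {"git.txt": None, "howto": {"setup.txt": None}},
    "builtin": {"grep.c": None, "log.c": None},
    "t": {"lib": {"helper.c": None}, "t0000.sh": None},
    "grep.c": None,
}


def walk_prefix(node, pre, base, visited):
    # Prefix limiter: descend only into directories on the way to `pre`
    # or below it; everything else is skipped without being opened.
    visited.append(base or ".")
    hits = []
    for name, child in sorted(node.items()):
        path = f"{base}/{name}" if base else name
        inside = path == pre or path.startswith(pre + "/")
        if child is None:
            if inside:
                hits.append(path)
        elif inside or pre.startswith(path + "/"):
            hits += walk_prefix(child, pre, path, visited)
    return hits


def walk_glob(node, pattern, base, visited):
    # Glob limiter: any directory might contain a match, so all are opened.
    visited.append(base or ".")
    hits = []
    for name, child in sorted(node.items()):
        path = f"{base}/{name}" if base else name
        if child is None:
            if fnmatch.fnmatchcase(path, pattern):
                hits.append(path)
        else:
            hits += walk_glob(child, pattern, path, visited)
    return hits


prefix_visited, glob_visited = [], []
prefix_hits = walk_prefix(tree, "builtin", "", prefix_visited)
glob_hits = walk_glob(tree, "*.c", "", glob_visited)
print(len(prefix_visited), prefix_hits)
print(len(glob_visited), glob_hits)
```

In this toy tree the prefix walk opens two directories and the glob walk
opens all six, and that gap grows with the size of the repository and the
number of commits walked.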