On Wed, Feb 11, 2009 at 11:40:44AM -0800, Linus Torvalds wrote:
>
>
> On Wed, 11 Feb 2009, Peter Baumann wrote:
> >
> > after reading Junio's nice blog today where he explained how to use git grep
> > efficiently, I saw him using a glob to match for the interesting files:
> >
> >   $ git grep -e ';;' -- '*.c'
> >
> > Is it possible to have the same feature in git diff and the revision
> > machinery?
>
> Not really. Git has two different kinds of path limiters, and they are
> really really different.
>
>  - the "walk current index/directory recursively" kind that "git ls-files"
>    uses, which takes a 'fnmatch()' type path regexp (not a real regexp,
>    but the kind you're used to with shell)
>
>    NOTE! On purpose, we don't set the FNM_PATHNAME, so "*.c" here is
>    different from *.c in shell (it's more like "**.c" in tcsh). IOW, *
>    matches '/' too, and will walk subdirectories.
>

Hm. But if git anchored the * at the current directory only, wouldn't this
solve (or at least reduce) the performance problems you describe in the later
paragraph? A "**.c" pattern could then be used when a recursive search for
every .c file is wanted.

>  - the "revision limiter" pathspec. This is *not* a regexp, it's a pure
>    prefix matcher, for a very simple reason: performance.
>
> >   $ cd $path_to_your_git_src_dir
> >   $ git log master -p -- '*.h'
> >   .... No commit shown
> >
> >   $ git diff --name-only v1.5.0 v1.6.0 -- '*.c'
> >
> > and both don't return anything.
>
> Yeah, in the revision matcher you can still depend on the shell
> expansion, and it will do _almost_ the right thing. So if you do
>
> 	git log master -p *.c
>
> without the quotes, the shell expansion will work, and that in turn will
> give a set of filenames that "git log" will restrict the log to.
> HOWEVER, it's not a real wildcard - it's literally looking at what you
> have now in your current working directory, and saying "give me the logs
> of those pathnames", not "give me the logs of everything ending with .c".
>

Ok. That's actually the reason why I asked for this: a file that has since
been removed wouldn't be found this way.

> We _could_ make the revision limiter understand fnmatch-style patterns,
> but quite frankly, it's very very expensive - too expensive to be useful
> for big repositories. The point about only matching prefixes is that it
> allows the revision limiter to not even walk into subdirectories that
> don't match, but if you do the "*.c" kind of pattern, now the revision
> code has to look up every tree recursively. That code is also _extremely_
> performance-critical, so we really don't want to use fnmatch() when we can
> currently use just "memcmp()".
>
> So yes, it's kind of odd how we have two totally different concepts of
> pathname patterns, but it's probably easiest to remember that "'git grep'
> is just special".
>
> 			Linus
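
To make the difference between the two path limiters concrete, here is a
rough Python sketch. It is not git's code. The file list and the names
glob_match() and prefix_match() are made up for illustration, and Python's
fnmatch module only stands in for fnmatch() called without FNM_PATHNAME
(in both, '*' also matches '/'). The prefix matcher assumes that matching
happens on whole path components.

```python
from fnmatch import fnmatchcase

# Hypothetical file list standing in for the paths recorded in a tree.
PATHS = [
    "Makefile",
    "builtin/grep.c",
    "diff.c",
    "diff.h",
    "t/helper/test-tool.c",
]


def glob_match(paths, pattern):
    """The "git grep" / "git ls-files" style limiter.

    '*' also matches '/', so "*.c" reaches into subdirectories.
    Every path has to be tested against the pattern, which means
    every subdirectory has to be walked.
    """
    return [p for p in paths if fnmatchcase(p, pattern)]


def prefix_match(paths, prefix):
    """The "revision limiter" style: a pure prefix comparison.

    Assumption: a prefix matches either the exact path or a leading
    directory. No wildcard is interpreted, so a whole subdirectory
    can be skipped as soon as its name does not match.
    """
    directory = prefix.rstrip("/") + "/"
    return [p for p in paths if p == prefix or p.startswith(directory)]


if __name__ == "__main__":
    print(glob_match(PATHS, "*.c"))    # .c files at any depth
    print(prefix_match(PATHS, "t"))    # everything below t/
    print(prefix_match(PATHS, "*.c"))  # "*.c" taken literally
```

With this file list the quoted '*.c' selects all three .c files under the
glob matcher, while the prefix matcher treats it as a literal name and
selects nothing, which mirrors the "No commit shown" result above.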