Re: [PROPOSAL] .gitignore syntax modification

On Oct 15, 2010, at 5:57 AM, Nguyen Thai Ngoc Duy wrote:

> On Fri, Oct 15, 2010 at 6:01 PM, Kevin Ballard <kevin@xxxxxx> wrote:
>> Got around to glancing at your patch. It looks pretty good, and it does build if you simply define EXC_FLAG_STARSTAR, though a few changes are definitely necessary (a path of "*" will cause it to run off the end of the string while trying to detect "**/"). I'll have some more time next week to take a much closer look. As for performance, I'm not particularly worried. The only performance change is when EXC_FLAG_STARSTAR is set: in the worst case it will try to apply the pattern once per level of directory nesting. As this is just string twiddling, it's bound to be pretty fast, and I don't think there's any viable alternative to doing this kind of loop anyway. That said, I'd still like to support putting **/ anywhere in the pattern instead of just at the beginning, and possibly even support ** (without the trailing /).
> 
> Well that would mean reimplementing fnmatch(). I don't know, maybe
> it's not hard to do that. '*' can already match '/' if FNM_PATHNAME is
> not given. So one just needs to tell fnmatch() '**' is '*' without
> FNM_PATHNAME.
> 
> "**/" optimization can be extended to support "path/to/**/" quite
> easily as long as no wildcards are used in "path/to/" part.
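
To make the quoted "**/" optimization concrete, here is a rough Python sketch. It is not the patch under discussion and not git's C code. It assumes that matching one path component at a time with Python's fnmatchcase() is a fair stand-in for fnmatch() with FNM_PATHNAME, and the function names are made up for illustration.

```python
from fnmatch import fnmatchcase


def match_component_wise(pattern, path):
    """Stand-in for fnmatch() with FNM_PATHNAME: '*' never crosses a '/'."""
    pat_parts = pattern.split("/")
    path_parts = path.split("/")
    if len(pat_parts) != len(path_parts):
        return False
    return all(fnmatchcase(name, pat) for pat, name in zip(pat_parts, path_parts))


def match_starstar_dir(pattern, path):
    """Handle '**/rest' and 'literal/prefix/**/rest' (hypothetical helper)."""
    prefix, sep, rest = pattern.partition("**/")
    has_wildcard = any(c in prefix for c in "*?[")
    # No '**/' at all, a wildcard before it, or '**' glued to a name:
    # treat the whole thing as an ordinary pattern. A short pattern such
    # as "*" lands here too, since partition() never reads past the end.
    if not sep or has_wildcard or (prefix and not prefix.endswith("/")):
        return match_component_wise(pattern, path)
    if not path.startswith(prefix):
        return False
    parts = path[len(prefix):].split("/")
    # Worst case: one attempt per level of directory nesting.
    return any(
        match_component_wise(rest, "/".join(parts[depth:]))
        for depth in range(len(parts))
    )
```

For example, match_starstar_dir("path/to/**/*.o", "path/to/x/y/z.o") strips the literal "path/to/" and then tries "*.o" against "x/y/z.o", "y/z.o" and "z.o" in turn. A second "**/" later in the pattern is not handled by this sketch.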

I think both cases can be dealt with while still using fnmatch(). We can split the pattern along all instances of "**" and then match each pattern segment with fnmatch() against successive parts of the path. If a given segment matches part of the path, then we can assume that's correct and move on (i.e. never backtrack to the ** before it). The only special case is that the very last pattern segment has to match at the end of the path. We can use slash-counting in each pattern segment to figure out how to slice up the path to pass to fnmatch().
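
Here is a minimal sketch of that splitting approach, written in Python rather than C. It makes two simplifying assumptions that go beyond what is described above: "**" only appears as a whole path component (as in "a/**/b"), and it may stand for zero or more directories. Counting the components of each segment plays the role of the slash-counting. All names are hypothetical.

```python
from fnmatch import fnmatchcase


def split_on_starstar(pattern):
    """Split 'a/**/b/*.c' into component lists: [['a'], ['b', '*.c']]."""
    segments = [[]]
    for comp in pattern.split("/"):
        if comp == "**":
            segments.append([])
        else:
            segments[-1].append(comp)
    return segments


def components_match(pats, names):
    """Match equally long lists of patterns and names, one by one."""
    return len(pats) == len(names) and all(
        fnmatchcase(name, pat) for pat, name in zip(pats, names)
    )


def match_starstar(pattern, path):
    segments = split_on_starstar(pattern)
    names = path.split("/")
    first, last = segments[0], segments[-1]
    if len(segments) == 1:  # no '**' in the pattern
        return components_match(first, names)
    # The first segment is anchored at the start of the path.
    if not components_match(first, names[:len(first)]):
        return False
    pos = len(first)
    # The last segment is anchored at the end of the path.
    end = len(names) - len(last)
    if end < pos or not components_match(last, names[end:]):
        return False
    # Middle segments take the leftmost match and never backtrack.
    for seg in segments[1:-1]:
        width = len(seg)
        while pos + width <= end and not components_match(seg, names[pos:pos + width]):
            pos += 1
        if pos + width > end:
            return False
        pos += width
    return True
```

With this, match_starstar("a/**/b/**/c", "a/x/b/y/z/c") anchors "a" and "c" at the two ends and then scans left to right for "b". Supporting a "**" that is glued to other characters would need slicing at character positions instead of component boundaries, which this sketch does not attempt.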

>> If we do support ** by itself, I wonder if we should special-case having ** as the last path component of the pattern. The possible behavior change we could have is making this only match files and not directories. The use-case here is putting something like "foo/**" in the top-level .gitignore and then a few levels into foo we could put another .gitignore with an inverse pattern in order to un-ignore some deep file (or just "!foo/*/*/bar.c" inside that top-level .gitignore as well). The only way I can think of to achieve this behavior with the current gitignore is something along the lines of
>> 
>> foo/*
>> !foo/bar/
>> foo/bar/*
>> !foo/bar/baz/
>> foo/bar/baz/*
>> !foo/bar/baz/bar.c
>> 
>> And even this will only work if you know all the intermediate directories. I cannot think of any way at all right now to ignore everything in a single directory except for one file at least 1 level of nesting deeper if you don't know the names of the intermediate directories. With the proposed special-case we can say
>> 
>> foo/**
>> !foo/*/*/bar.c
>> 
>> and it will behave exactly as specified.
>> 
>> It occurs to me that we could actually tweak this slightly, to say that if a ** is encountered and there are zero slashes in the pattern after it, then it will only match files (with zero or more leading directories). This way you can have a pattern "foo/**.d" which only ignores files with the extension ".d" but will still avoid ignoring directories that end in ".d".
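
The files-only rule quoted above could be sketched roughly as follows. This is an illustration in Python, not git's implementation. It assumes the part of the pattern before "**" is a literal directory prefix (empty or ending in "/"), and that the caller already knows whether the path is a directory. The function name is invented.

```python
from fnmatch import fnmatchcase


def match_files_only(pattern, path, is_dir):
    """Match a pattern whose '**' has no '/' after it, e.g. 'foo/**.d'."""
    prefix, sep, suffix = pattern.partition("**")
    if not sep or "/" in suffix:
        raise ValueError("expected a '**' with no '/' after it")
    if prefix and not prefix.endswith("/"):
        raise ValueError("this sketch needs '**' to start a path component")
    # Directories are never matched, so they stay visible to later rules.
    if is_dir or not path.startswith(prefix):
        return False
    # Zero or more leading directories are allowed; only the file name
    # itself is compared against '*' plus whatever follows the '**'.
    basename = path[len(prefix):].rsplit("/", 1)[-1]
    return basename != "" and fnmatchcase(basename, "*" + suffix)
```

Under these assumptions "foo/**.d" matches the file foo/a/b/x.d but not a directory named foo/build.d, and "foo/**" matches every file below foo while leaving the intermediate directories alone, which is what lets a later "!foo/*/*/bar.c" take effect.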
> 
> No idea. Seems overkill to me. But I don't use .gitignore heavily. For
> really complex ignore rules, how about allowing an external process to
> do the job? It would keep .gitignore syntax simple, yet powerful when
> needed.
> 
> A leading '|' marks an external process and can be intermixed with
> normal patterns in .gitignore. When excluded_from_list examines a
> '|' pattern, it sends all information to the associated process's
> stdin and expects a result code on stdout. The process is started
> the first time the pattern is examined and is kept alive until the
> git process terminates.

That would certainly be powerful, but I don't know how much work it would take to implement. I still haven't really looked at the gitignore code yet. I think it's a good suggestion, but I still want to handle ** natively if possible.
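
For what it's worth, the helper side of such a '|' process might look something like the Python sketch below. Nothing here is an existing git feature, and the wire format is purely an assumption made for illustration: one path per line on stdin, answered with "1" (ignored) or "0" (not ignored) on stdout. The proposal itself only says "all information" and "a result code".

```python
from fnmatch import fnmatchcase


def serve(inp, out, is_ignored):
    """Answer one query per input line until the input is closed.

    Assumed protocol (hypothetical): each line is a path; the reply is
    "1" if the path should be ignored and "0" otherwise.
    """
    for line in inp:
        path = line.rstrip("\n")
        out.write("1\n" if is_ignored(path) else "0\n")
        # Flush after every answer, since the caller waits for each reply.
        out.flush()


def ignore_object_files(path):
    """Example rule: ignore anything whose name ends in '.o'."""
    return fnmatchcase(path.rsplit("/", 1)[-1], "*.o")
```

A real helper would call serve(sys.stdin, sys.stdout, ignore_object_files) and keep running for as long as the parent process holds the pipe open, matching the "started once, kept alive" behavior in the proposal.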

-Kevin Ballard