On Oct 15, 2010, at 5:57 AM, Nguyen Thai Ngoc Duy wrote:

> On Fri, Oct 15, 2010 at 6:01 PM, Kevin Ballard <kevin@xxxxxx> wrote:
>> Got around to glancing at your patch. Looks pretty good, and it does build if you simply define EXC_FLAG_STARSTAR, though there are a few changes that are definitely necessary (a path of "*" will cause this to run off the end of the string while trying to detect "**/"). I'll have some more time next week to take a much closer look though. As for performance, I'm not particularly worried. The only performance change is if EXC_FLAG_STARSTAR is checked, in the worst-case it'll try to apply the pattern once per level of directory nesting. As this is just string twiddling, it's bound to be pretty fast, and I don't think there's any viable alternative to doing this kind of loop anyway. That said, I'd still like to support putting **/ anywhere in the pattern instead of just at the beginning, and possibly even support ** (without the trailing /).
>
> Well that would mean reimplementing fnmatch(). I don't know, maybe
> it's not hard to do that. '*' can already match '/' if FNM_PATHNAME is
> not given. So one just needs to tell fnmatch() '**' is '*' without
> FNM_PATHNAME.
>
> "**/" optimization can be extended to support "path/to/**/" quite
> easily as long as no wildcards are used in "path/to/" part.

I think both cases can be dealt with while still using fnmatch(). We can split the string along all instances of "**" and then match each pattern segment with fnmatch() along parts of the path. If a given segment matches part of the path, then we can assume that's correct and move on (i.e. never backtrack to the ** before it). The only special case is that the very last path segment has to match at the end of the path, and we can use slash-counting in each path segment in order to figure out how to slice up the path to pass to fnmatch().
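
The segment-matching idea above could be sketched roughly as follows. This is only an illustration in Python, not git's C code: Python's fnmatch module has no FNM_PATHNAME flag, so the sketch compares one path component at a time instead of handing slash-counted slices to fnmatch(). It also assumes "**" only appears as a whole path component (the "**/" case), and the names starstar_match and match_group are made up for this example.

```python
from fnmatch import fnmatchcase


def match_group(group, parts):
    """True if every pattern component matches the path component at the same index."""
    return len(group) == len(parts) and all(
        fnmatchcase(part, pat) for pat, part in zip(group, parts)
    )


def starstar_match(pattern, path):
    """Match path against pattern; a '**' component spans zero or more directories."""
    parts = path.split("/")
    # Split the pattern into groups of components separated by '**'.
    groups = [[]]
    for comp in pattern.split("/"):
        if comp == "**":
            groups.append([])
        else:
            groups[-1].append(comp)

    if len(groups) == 1:
        # No '**' at all: plain component-wise match.
        return match_group(groups[0], parts)

    first, *middle, last = groups
    # The first group is anchored at the start of the path.
    if not match_group(first, parts[:len(first)]):
        return False
    pos = len(first)
    # The last group is anchored at the end of the path.
    end = len(parts) - len(last)
    if end < pos or not match_group(last, parts[end:]):
        return False
    # Middle groups: take the earliest window that matches, never backtrack.
    for group in middle:
        n = len(group)  # the component count plays the role of slash-counting
        while pos + n <= end and not match_group(group, parts[pos:pos + n]):
            pos += 1
        if pos + n > end:
            return False
        pos += n
    return True
```

With this sketch, starstar_match("foo/**/bar.c", "foo/a/b/bar.c") and starstar_match("**/bar.c", "bar.c") both succeed, while starstar_match("foo/*/bar.c", "foo/a/b/bar.c") fails because a single "*" still covers exactly one directory level.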

>> If we do support ** by itself, I wonder if we should special-case having ** as the last path component of the pattern. The possible behavior change we could have is making this only match files and not directories. The use-case here is putting something like "foo/**" in the top-level .gitignore and then a few levels into foo we could put another .gitignore with an inverse pattern in order to un-ignore some deep file (or just "!foo/*/*/bar.c" inside that top-level .gitignore as well). The only way I can think of to achieve this behavior with the current gitignore is something along the lines of
>>
>> foo/*
>> !foo/bar/
>> foo/bar/*
>> !foo/bar/baz/
>> foo/bar/baz/*
>> !foo/bar/baz/bar.c
>>
>> And even this will only work if you know all the intermediate directories. I cannot think of any way at all right now to ignore everything in a single directory except for one file at least 1 level of nesting deeper if you don't know the names of the intermediate directories. With the proposed special-case we can say
>>
>> foo/**
>> !foo/*/*/bar.c
>>
>> and it will behave exactly as specified.
>>
>> It occurs to me that we could actually tweak this slightly, to say that if a ** is encountered and there are zero slashes in the pattern after it, then it will only match files (with zero or more leading directories). This way you can have a pattern "foo/**.d" which only ignores files with the extension ".d" but will still avoid ignoring directories that end in ".d".
>
> No idea. Seems overkill to me. But I don't use .gitignore heavily. For
> really complex ignore rules, how about allowing an external process to
> do the job? It would keep .gitignore syntax simple, yet powerful when
> needed.
>
> A leading '|' marks an external process and can be used intermixed
> with normal patterns in .gitignore. When excluded_from_list examines a
> '|' pattern, it sends all information to the associated process' stdin
> and expects a result code in stdout. The process is started when it
> is examined the first time and is kept alive until git process
> terminates.

That would certainly be powerful, but I don't know how much work it would take to implement. I still haven't really looked at the gitignore code yet. I think this is a good suggestion, but I still want to handle ** natively if possible.

-Kevin Ballard