On Tue, Mar 26, 2013 at 01:49:10PM -0700, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
>
> > I timed this doing "git archive HEAD" on webkit.git before and after. It
> > actually ended up not mattering much (I think because it is only the
> > directories which are affected, not each individual path, so it's a
> > much smaller number than you'd think). The best-of-five timing was
> > slightly slower, but was within the noise.
>
> Interesting. Because "archive" has to incur a large I/O cost
> anyway, I expected extra allocation for correctness for only the
> directory paths would be dwarfed in the noise.
>
> I actually care more about cases other than "archive", though. Do
> we even feed directory paths to the machinery?

In general, no, I don't think so. That's why I tested "archive", since I
knew it did. In the normal case, we should just feed file paths, meaning
we only run into this code path when somebody has "foo/" in their
pattern. Testing like:

  git ls-files -z >files
  time git check-attr --stdin -z -a <files >/dev/null

showed a difference well within the noise.

> > So I do still think it would make sense to go to a byte-limited version
> > of fnmatch eventually, just for code cleanliness and predictability of
> > performance, but this is really not a bad solution in the interim.
>
> Yes, what we do with wildmatch is a separate issue for 'master' and
> upwards.

Oh, agreed. I just wanted to see how much performance would be impacted
for the interim. But it seems that it's not.

So I think your series is the right direction, but we would want to
factor out the allocation code and use it from match_pathname, as well.

-Peff