Background: pathspecs in git are handled differently in three places:

 1. the log family uses tree_entry_interesting() and ce_path_match()
 2. most index-related operations use match_pathspec()
 3. grep uses its own pathspec_matches()

Of the three, #3 provides the most advanced functionality, while #1
has a few good optimizations but is not as powerful as #3. #2 is a
sort of trade-off between the other two.

This series brings all the #3 goodness to #1 and #2, then kills #3. I
don't want to kill #2 because it takes a list as input, while #1 takes
trees (ce_path_match() takes a list, though). There could be different
optimizations based on the different input types.

Summary of patches:

  Add struct pathspec
  diff-no-index: use diff_tree_setup_paths()
  pathspec: cache string length when initializing pathspec
  Convert struct diff_options to use struct pathspec
  tree_entry_interesting(): remove dependency on struct diff_options
  Move tree_entry_interesting() to tree-walk.c and export it

These are unchanged from nd/struct-pathspec in pu. One patch from pu
is replaced later.

  glossary: define pathspec

This is what I am aiming for. If I make mistakes, blame Jonathan
because he mis-specifies it ;-)

  pathspec: mark wildcard pathspecs from the beginning

From the old nd/struct-pathspec, to recognize potential wildcard
pathspecs early.

  tree-diff.c: reserve space in "base" for pathname concatenation

Probably the most used operation in traversing trees is concatenating
dirname and basename into a full path (especially for wildcard
matching). This requires a new buffer every time. This patch ensures
that the caller prepares a writable buffer with dirname already
filled. If the callee wants the full path, it does not have to
allocate another buffer (and does a shorter memcpy). This patch is not
strictly needed, though.

  tree_entry_interesting(): factor out most matching logic

For readability of the next patches.

  tree_entry_interesting: support depth limit

Goodness from #3.
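To illustrate the buffer reuse described in the tree-diff.c note above, here is a minimal sketch. It is not the code from the patch: struct path_buf, full_path(), enter_dir() and leave_dir() are hypothetical names made up for this example, and the only assumption carried over from the series is that the caller owns a writable buffer with dirname already filled in and with room reserved for the basename.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical buffer type for this sketch; not a structure from git. */
struct path_buf {
	char *buf;	/* writable storage prepared by the caller */
	size_t len;	/* length of the dirname currently held in buf */
	size_t alloc;	/* total capacity of buf, including the NUL */
};

/*
 * Append "name" after the dirname already in pb and return the full
 * path. Nothing is allocated here, and only the basename is copied.
 */
static const char *full_path(struct path_buf *pb, const char *name)
{
	size_t namelen = strlen(name);

	assert(pb->len + namelen + 1 <= pb->alloc);
	memcpy(pb->buf + pb->len, name, namelen + 1);	/* copies the NUL too */
	return pb->buf;
}

/* Descend into a subdirectory: "name/" becomes part of the dirname. */
static size_t enter_dir(struct path_buf *pb, const char *name)
{
	size_t old_len = pb->len;
	size_t namelen = strlen(name);

	assert(pb->len + namelen + 2 <= pb->alloc);
	memcpy(pb->buf + pb->len, name, namelen);
	pb->buf[pb->len + namelen] = '/';
	pb->len += namelen + 1;
	pb->buf[pb->len] = '\0';
	return old_len;		/* caller passes this to leave_dir() */
}

/* Leave the subdirectory by truncating back to the saved length. */
static void leave_dir(struct path_buf *pb, size_t old_len)
{
	pb->len = old_len;
	pb->buf[pb->len] = '\0';
}
```

The point of the sketch is that entering and leaving a directory only moves the length marker, so a wildcard match on the full path costs one short memcpy of the basename rather than a fresh allocation per entry.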
  tree_entry_interesting(): support wildcard matching
  tree_entry_interesting(): optimize fnmatch when base is matched

This is something t_e_i() has lacked for so long. However, in order to
make log family commands work properly, ce_path_match() also needs to
learn wildcards.

This changes the tree_entry_interesting() interface and therefore
breaks en/object-list-with-pathspec. I'll send fixes shortly.

  Convert ce_path_match() use to match_pathspec()

So that the log family now works with wildcards.

  pathspec: add match_pathspec_depth()

This is the new match_pathspec(). I don't want to replace the old one
because that would change more places. But once it works, another
patch to kill match_pathspec() should be easy.

  grep: convert to use struct pathspec
  grep: use match_pathspec_depth() for cache grepping
  grep: use preallocated buffer for grep_tree()
  grep: drop pathspec_matches() in favor of tree_entry_interesting()

grep (especially t7810) is how I test all of this. I need to write
more tests to make sure things work, but for now t7810 passes.
Hopefully I did not lose any optimizations in pathspec_matches().

It's time to rebase the negative pathspec patches on top and get back
to my narrow clone.

[1] https://git.wiki.kernel.org/index.php/SoC2010Ideas#Unify_Pathspec_Semantics

 Documentation/glossary-content.txt |   23 ++++
 builtin/diff-files.c               |    2 +-
 builtin/diff.c                     |    4 +-
 builtin/grep.c                     |  200 ++++++++---------------------
 builtin/log.c                      |    2 +-
 cache.h                            |   14 ++
 diff-lib.c                         |    2 +-
 diff-no-index.c                    |   13 +-
 diff.h                             |    4 +-
 dir.c                              |   98 ++++++++++++++
 dir.h                              |    4 +
 read-cache.c                       |   20 +---
 revision.c                         |    6 +-
 t/t4010-diff-pathspec.sh           |   14 ++
 tree-diff.c                        |  246 ++++++++----------------------------
 tree-walk.c                        |  186 +++++++++++++++++++++++++++
 tree-walk.h                        |    2 +
 17 files changed, 461 insertions(+), 379 deletions(-)

-- 
1.7.3.3.476.g10a82