[PATCH 00/19] nd/struct-pathspec (or pathspec unification [1])

Background:

Pathspecs in git are handled differently in three places:

 1. log family uses tree_entry_interesting() and ce_path_match()
 2. most index-related operations use match_pathspec()
 3. grep uses its own pathspec_matches()

Out of the three, #3 provides the most advanced functionality, while #1
has a few good optimizations but is not as powerful as #3. #2 is a sort
of trade-off between the other two.

This series brings all the #3 goodness to #1 and #2, then kills #3. I
don't want to kill #2 because it takes a list as input, while #1 takes
trees (ce_path_match() takes a list though). There could be different
optimizations based on the different input types.

Summary of patches:

  Add struct pathspec
  diff-no-index: use diff_tree_setup_paths()
  pathspec: cache string length when initializing pathspec
  Convert struct diff_options to use struct pathspec
  tree_entry_interesting(): remove dependency on struct diff_options
  Move tree_entry_interesting() to tree-walk.c and export it

This is unchanged from nd/struct-pathspec in pu. There is one patch
from pu replaced later.
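
To illustrate the idea behind "cache string length when initializing
pathspec", here is a minimal sketch. The type and function names
(pathspec_sketch, pathspec_sketch_init, ...) are hypothetical and are
not the definitions used in the series; the only point is that
strlen() is paid once at setup time rather than on every match.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout for illustration; not the series' definition. */
struct pathspec_item_sketch {
	const char *match;	/* pathspec string, owned by the caller */
	size_t len;		/* strlen(match), computed once at init */
};

struct pathspec_sketch {
	int nr;
	struct pathspec_item_sketch *items;
};

/* Fill ps from a NULL-terminated list. Returns 0, or -1 on allocation failure. */
static int pathspec_sketch_init(struct pathspec_sketch *ps, const char **paths)
{
	int i, nr = 0;

	while (paths && paths[nr])
		nr++;
	ps->nr = nr;
	ps->items = nr ? calloc(nr, sizeof(*ps->items)) : NULL;
	if (nr && !ps->items)
		return -1;
	for (i = 0; i < nr; i++) {
		ps->items[i].match = paths[i];
		ps->items[i].len = strlen(paths[i]);
	}
	return 0;
}

static void pathspec_sketch_free(struct pathspec_sketch *ps)
{
	free(ps->items);
	ps->items = NULL;
	ps->nr = 0;
}
```

A matcher can then compare against items[i].len directly instead of
rescanning each pathspec string for every tree entry.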

  glossary: define pathspec

This is what I am aiming for. If I made mistakes, blame Jonathan
because he mis-specified it ;-)

  pathspec: mark wildcard pathspecs from the beginning

From old nd/struct-pathspec, to recognize potential wildcard pathspecs
early.
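
A rough sketch of what "marking" could look like: scan each pathspec
once for glob-special characters and remember the result. The helper
name has_wildcard_sketch is hypothetical, and the set of characters
treated as special here ('*', '?', '[' and backslash) is an assumption
for illustration.

```c
#include <string.h>

/*
 * Return 1 if the pathspec contains a character that a glob matcher
 * would treat specially, 0 if it can be compared as a plain prefix.
 * The character set is an assumption made for this sketch.
 */
static int has_wildcard_sketch(const char *match)
{
	return strpbrk(match, "*?[\\") != NULL;
}
```

With the flag computed up front, plain pathspecs can stay on the cheap
prefix-comparison path and only flagged ones need pattern matching.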

  tree-diff.c: reserve space in "base" for pathname concatenation

Probably the most used operation when traversing trees is concatenating
dirname and basename into a full path (especially for wildcard matching).
This requires a new buffer every time. This patch ensures that the
caller prepares a writable buffer with dirname already filled in. If the
callee wants the full path, it does not have to allocate another buffer
(and does a shorter memcpy).

This patch is not strictly needed though.
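
The buffer reuse described above can be sketched roughly as follows.
The function name append_entry_name and its explicit capacity argument
are hypothetical; the sketch only assumes that the caller owns a
writable buffer whose first baselen bytes already hold the directory
name.

```c
#include <string.h>

/*
 * base[0..baselen) already holds the directory prefix (e.g. "builtin/").
 * Append the entry name in place, so that only the basename is copied.
 * Returns the length of the full path, or -1 if it does not fit in cap.
 */
static int append_entry_name(char *base, size_t baselen, size_t cap,
			     const char *name)
{
	size_t namelen = strlen(name);

	if (baselen + namelen + 1 > cap)
		return -1;
	memcpy(base + baselen, name, namelen + 1);	/* includes the NUL */
	return (int)(baselen + namelen);
}
```

Each entry in the same directory overwrites the tail left by the
previous one, so no per-entry allocation is needed.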

  tree_entry_interesting(): factor out most matching logic

For readability of the next patches.

  tree_entry_interesting: support depth limit

Goodness from #3.
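
As an illustration of a depth limit, the sketch below counts directory
separators in the part of a path that follows the matched pathspec
directory. The name within_depth_sketch and the convention that a
negative max_depth means "unlimited" are assumptions of this sketch,
not taken from the patch.

```c
/*
 * name is the path relative to the matched pathspec directory.
 * Return 1 if it lies at most max_depth directory levels below it.
 * A negative max_depth disables the limit (assumption of this sketch).
 */
static int within_depth_sketch(const char *name, int max_depth)
{
	int depth = 0;

	if (max_depth < 0)
		return 1;
	for (; *name; name++)
		if (*name == '/' && ++depth > max_depth)
			return 0;
	return 1;
}
```

With max_depth 0, "a.c" is accepted and "sub/a.c" is rejected; with
max_depth 1, "sub/a.c" is accepted as well.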

  tree_entry_interesting(): support wildcard matching
  tree_entry_interesting(): optimize fnmatch when base is matched

This is something t_e_i() has lacked for so long. However, in order to
make log family commands work properly, ce_path_match() also needs to
learn wildcards.
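
One way to picture the "base is matched" optimization is sketched
below, using POSIX fnmatch(). When the wildcard-free prefix of the
pattern already covers the base directory, only the remainder of the
pattern has to be matched against the entry name. The function name
entry_matches_sketch, the fnmatch flags (0) and the set of special
characters are assumptions for illustration; the actual code in the
series may differ.

```c
#include <fnmatch.h>
#include <string.h>

/*
 * path is the base directory (baselen bytes, trailing '/' included)
 * followed by the entry name. Return 1 if pattern matches path.
 */
static int entry_matches_sketch(const char *pattern, const char *path,
				size_t baselen)
{
	/* length of the leading part of pattern with no glob specials */
	size_t literal = strcspn(pattern, "*?[\\");

	/*
	 * Fast path: the literal prefix covers the whole base and is
	 * identical to it, so skip it and match only the entry name.
	 */
	if (literal >= baselen && !strncmp(pattern, path, baselen))
		return fnmatch(pattern + baselen, path + baselen, 0) == 0;

	/* Slow path: match the pattern against the full path. */
	return fnmatch(pattern, path, 0) == 0;
}
```

For example, with pattern "builtin/*.c" and base "builtin/", each entry
is tested against "*.c" only, instead of rescanning the directory
prefix for every file.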

This changes the tree_entry_interesting() interface and therefore breaks
en/object-list-with-pathspec. I'll send fixes shortly.

  Convert ce_path_match() use to match_pathspec()

So that log family now works with wildcards.

  pathspec: add match_pathspec_depth()

This is the new match_pathspec(). I don't want to replace the old one
because that would change more places. But once it works, another patch
to kill match_pathspec() should be easy.

  grep: convert to use struct pathspec
  grep: use match_pathspec_depth() for cache grepping
  grep: use preallocated buffer for grep_tree()
  grep: drop pathspec_matches() in favor of tree_entry_interesting()

grep (especially t7810) is how I test all these. I need to write more
tests to make sure things work. But for now t7810 passes.

Hopefully I did not lose any optimizations in pathspec_matches().

It's time to rebase negative pathspec patches on top and get back to
my narrow clone.

[1] https://git.wiki.kernel.org/index.php/SoC2010Ideas#Unify_Pathspec_Semantics

 Documentation/glossary-content.txt |   23 ++++
 builtin/diff-files.c               |    2 +-
 builtin/diff.c                     |    4 +-
 builtin/grep.c                     |  200 ++++++++---------------------
 builtin/log.c                      |    2 +-
 cache.h                            |   14 ++
 diff-lib.c                         |    2 +-
 diff-no-index.c                    |   13 +-
 diff.h                             |    4 +-
 dir.c                              |   98 ++++++++++++++
 dir.h                              |    4 +
 read-cache.c                       |   20 +---
 revision.c                         |    6 +-
 t/t4010-diff-pathspec.sh           |   14 ++
 tree-diff.c                        |  246 ++++++++----------------------------
 tree-walk.c                        |  186 +++++++++++++++++++++++++++
 tree-walk.h                        |    2 +
 17 files changed, 461 insertions(+), 379 deletions(-)

-- 
1.7.3.3.476.g10a82
