[PATCH 0/7] Minor bug fix and optimizations for revision/tree walking

While working on sparse clones[1], I discovered a minor bug in the
tree-walking machinery for rev-list when both --objects and paths are
specified.  With that combination of options, the paths were
(correctly) used to select relevant commits, but were (incorrectly)
ignored when walking the subtrees of those commits.  While I was at
it, I also cleaned a few things up and added some small
optimizations.  All the tests pass for me.
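
A minimal sketch of how the bug can be observed from the command line.
The repository layout and the names wanted_dir, unwanted_dir,
wanted_file and unwanted_file are made up for illustration and are not
taken from the test in this series.  With the fix applied, nothing
under unwanted_dir should show up in the object listing.

```shell
#!/bin/sh
# Hypothetical reproduction; all directory and file names are illustrative.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Two sibling directories, only one of which the pathspec will select.
mkdir wanted_dir unwanted_dir
echo one >wanted_dir/wanted_file
echo two >unwanted_dir/unwanted_file
git add .
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m "add two directories"

# List the objects reachable from HEAD, limited to wanted_dir/.
git rev-list --objects HEAD -- wanted_dir/ >output

# The pathspec should limit the tree walk as well as the commit selection.
if grep -q unwanted_file output; then
    echo "pathspec ignored during tree walk"
else
    echo "pathspec honored during tree walk"
fi
```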

NOTE: The last two patches are here mostly to see if anyone knows of
any real uses of git with ginormous trees that are really deep and
have few entries per tree.  These two patches might help with such a
case, but otherwise they seem to make the code a bit uglier and don't
really help performance-wise (and I think may even hurt a little bit).

Performance-wise, ignoring the last two patches, I get the following
approximate speedups:
  (A -  0%) git rev-list --quiet HEAD
  (B -  4%) git rev-list --quiet HEAD -- Documentation/
  (C -  3%) git rev-list --quiet HEAD -- t/
  (D -  1%) git rev-list --objects HEAD > /dev/null
  (E - 66%) git rev-list --objects HEAD -- Documentation/ > /dev/null

Complete timings (in seconds) on my laptop (core 2 duo?):
               A      B      C      D      E
  maint       0.35   0.69   1.35   1.92   1.40
  Patch-1     0.35   0.70   1.35   1.92   1.40
  Patch-2     0.34   0.69   1.35   1.90   0.85
  Patch-3     0.35   0.69   1.35   1.90   0.84
  Patch-4     0.34   0.70   1.35   1.90   0.85
  Patch-5     0.35   0.66   1.31   1.90   0.84
  Patch-6     0.35   0.66   1.31   1.92   0.82
  Patch-7     0.35   0.66   1.31   1.91   0.81
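
The percentages above appear to be the maint time divided by the
Patch-5 time, minus one, truncated to a whole percent.  That reading
is inferred from the table rather than stated anywhere, so treat the
sketch below as an assumption; the function name speedup is
hypothetical.

```shell
#!/bin/sh
# speedup BASE NEW: percent speedup of NEW over BASE, truncated.
# Assumes speedup is defined as (BASE / NEW - 1) * 100.
speedup() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%d%%\n", (base / new - 1) * 100 }'
}

# Case E: maint took 1.40s, Patch-5 took 0.84s.
speedup 1.40 0.84
# Case B: maint took 0.69s, Patch-5 took 0.66s.
speedup 0.69 0.66
```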

For each case, I ran the command 6 times and averaged the last 5 runs.
I've rerun the cases a few times to regenerate the above table; the
numbers vary by about +/- 0.01 seconds between runs, so the speedups
are slightly noisy given the small values, but they are consistently
positive for me.  I sometimes see the last two patches have a slightly
more negative impact, though it's pretty close to noise.
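
A rough sketch of the measurement procedure described above: run a
command 6 times, discard the first run as a warm-up, and average the
remaining 5.  The helper names time_runs and avg_last_five are
hypothetical, and the timing relies on GNU date supporting %N
(nanoseconds), which is an assumption about the environment.

```shell
#!/bin/sh
# time_runs CMD...: print the wall-clock seconds of 6 runs, one per line.
# Assumes GNU date, whose %N format prints nanoseconds.
time_runs() {
    i=0
    while [ "$i" -lt 6 ]; do
        start=$(date +%s.%N)
        "$@" >/dev/null 2>&1
        end=$(date +%s.%N)
        echo "$start $end" | awk '{ printf "%.3f\n", $2 - $1 }'
        i=$((i + 1))
    done
}

# avg_last_five: skip the first line of input and average the rest.
avg_last_five() {
    awk 'NR > 1 { sum += $1; n++ }
         END { if (n) printf "%.2f\n", sum / n }'
}
```

Usage would look something like
"time_runs git rev-list --quiet HEAD | avg_last_five", run once on
maint and once with each patch applied.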


Elijah Newren (7):
  Add testcase showing how pathspecs are ignored with rev-list
    --objects
  Fix ignoring of pathspecs with rev-list --objects
  tree-walk: Correct bitrotted comment about tree_entry()
  tree_entry_interesting(): Make return value more specific
  diff_tree(): Skip skip_uninteresting() when all remaining paths
    interesting
  list-objects.c: Avoid recomputing interesting-ness for subtrees when
    possible
  tree-diff.c: Avoid recomputing interesting-ness for subtrees when
    possible

 diff.h                   |    1 +
 list-objects.c           |   27 ++++++++++++++++++---
 t/t6000-rev-list-misc.sh |   23 ++++++++++++++++++
 tree-diff.c              |   58 +++++++++++++++++++++++-----------------------
 tree-walk.h              |    4 ++-
 5 files changed, 79 insertions(+), 34 deletions(-)
 create mode 100755 t/t6000-rev-list-misc.sh

[1] It looks like Nguyễn Thái Ngọc Duy is much further along than I am
on such work.  Perhaps the only point in the work I was doing was to
enable me to help review a few of his patches (which I will try to
find some time to do), though perhaps the different route I have taken
will end up helping some.  It's fun and educational either way.

-- 
1.7.2.2.39.gf7e23
