On 4/8/2017 10:06 AM, René Scharfe wrote:
On 07.04.2017 at 17:53, git@xxxxxxxxxxxxxxxxx wrote:
From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
Teach traverse_trees_recursive() to avoid redundant ODB
lookups when two directory entries refer to the same OID.
In operations such as read-tree, checkout, and merge, when the
differences between the commits are relatively small, there will
likely be many directories that have the same SHA-1. In these
cases we can avoid hitting the ODB multiple times for the same
SHA-1.
This patch handles the n=2 and n=3 cases and simply copies the
already-filled tree descriptor rather than repeating the
fill_tree_descriptor() call.
================
On the Windows repo (500K trees, 3.1M files, 450MB index),
this reduced the overall time by 0.75 seconds when cycling
between 2 commits with a single file difference.
(avg) before: 22.699
(avg) after: 21.955
================
================
Using the p0004-read-tree test (posted earlier this week)
with 1M files on Linux:
before:
$ ./p0004-read-tree.sh
0004.5: switch work1 work2 (1003037) 11.99(8.12+3.32)
0004.6: switch commit aliases (1003037) 11.95(8.20+3.14)
after:
$ ./p0004-read-tree.sh
0004.5: switch work1 work2 (1003037) 11.02(7.71+2.78)
0004.6: switch commit aliases (1003037) 10.95(7.57+2.82)
================
Signed-off-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
---
unpack-trees.c | 23 +++++++++++++++++++----
1 file changed, 19 insertions(+), 4 deletions(-)
diff --git a/unpack-trees.c b/unpack-trees.c
index 3a8ee19..143c5d9 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -531,6 +531,11 @@ static int switch_cache_bottom(struct traverse_info *info)
 	return ret;
 }
 
+static inline int are_same_oid(struct name_entry *name_j, struct name_entry *name_k)
+{
+	return name_j->oid && name_k->oid && !oidcmp(name_j->oid, name_k->oid);
+}
+
 static int traverse_trees_recursive(int n, unsigned long dirmask,
 				    unsigned long df_conflicts,
 				    struct name_entry *names,
@@ -554,10 +559,20 @@ static int traverse_trees_recursive(int n, unsigned long dirmask,
 	newinfo.df_conflicts |= df_conflicts;
 	for (i = 0; i < n; i++, dirmask >>= 1) {
-		const unsigned char *sha1 = NULL;
-		if (dirmask & 1)
-			sha1 = names[i].oid->hash;
-		buf[i] = fill_tree_descriptor(t+i, sha1);
+		if (i > 0 && are_same_oid(&names[i], &names[i-1])) {
+			/* implicitly borrow buf[i-1] inside tree_desc[i] */
+			memcpy(&t[i], &t[i-1], sizeof(struct tree_desc));

An assignment would be simpler:

	t[i] = t[i - 1];

True, but this might be a coin toss. Maybe we should see what the
generated assembly looks like for each?
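
If it helps the comparison, here is a tiny stand-alone sketch (using a
hypothetical stand-in struct, not the real tree_desc) of the two forms;
both copy the whole struct, the assignment is just type-checked by the
compiler:

#include <string.h>

/* hypothetical stand-in for struct tree_desc, for illustration only */
struct desc {
	const void *buffer;
	unsigned int size;
};

/* whole-struct copy via assignment; the compiler checks the types */
void copy_assign(struct desc *t, int i)
{
	t[i] = t[i - 1];
}

/* byte-for-byte copy via memcpy; the size must be given explicitly */
void copy_memcpy(struct desc *t, int i)
{
	memcpy(&t[i], &t[i - 1], sizeof(struct desc));
}
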
+			buf[i] = NULL;
+		} else if (i > 1 && are_same_oid(&names[i], &names[i-2])) {
+			/* implicitly borrow buf[i-2] inside tree_desc[i] */
+			memcpy(&t[i], &t[i-2], sizeof(struct tree_desc));

Similar case.

+			buf[i] = NULL;

What makes the previous two entries special, or put differently: why
not just check *all* previous entries? MAX_UNPACK_TREES is 8; the
number of comparisons would just about double in the worst case and
stay the same for three trees or fewer. The order of four or more
trees wouldn't matter anymore.

I tried to hit the common cases. This loop runs a lot and I didn't
want to put an O(n^2) scan in there to look for any matching peer.
Most of the time we are doing a simple 2- or 3-way operation, and I
didn't want to pay the looping/branching overhead for the obscure
[4..8]-way cases.
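
For reference, the "check all previous entries" version might look
roughly like this (an untested sketch reusing the identifiers from the
patch; the inner scan is bounded by MAX_UNPACK_TREES == 8):

	for (i = 0; i < n; i++, dirmask >>= 1) {
		int j;

		/* reuse the descriptor of any earlier entry with the same OID */
		for (j = i - 1; j >= 0; j--)
			if (are_same_oid(&names[i], &names[j]))
				break;

		if (j >= 0) {
			/* implicitly borrow buf[j] inside tree_desc[i] */
			memcpy(&t[i], &t[j], sizeof(struct tree_desc));
			buf[i] = NULL;
		} else {
			const unsigned char *sha1 = NULL;
			if (dirmask & 1)
				sha1 = names[i].oid->hash;
			buf[i] = fill_tree_descriptor(t+i, sha1);
		}
	}
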
+		} else {
+			const unsigned char *sha1 = NULL;
+			if (dirmask & 1)
+				sha1 = names[i].oid->hash;
+			buf[i] = fill_tree_descriptor(t+i, sha1);
+		}
 	}
 
 	bottom = switch_cache_bottom(&newinfo);