[PATCH v4 5/5] unpack-trees: reuse (still valid) cache-tree from src_index

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



We do n-way merge by walking the source index and n trees at the same
time and add merge results to a new temporary index called o->result.
The merge result for any given path could be either

- keep_entry(): same old index entry in o->src_index is reused
- merged_entry(): either a new entry is added, or an existing one updated
- deleted_entry(): one entry from o->src_index is removed

For some reason [1] we keep making sure that the source index's
cache-tree is still valid if used by o->result: for all those
merged/deleted entries, we invalidate the same path in o->src_index,
so only cache-trees covering the "keep_entry" parts remain good.

Because of this, the cache-tree from o->src_index can be perfectly
reused in o->result. And in fact we already rely on this logic to
reuse untracked cache in edf3b90553 (unpack-trees: preserve index
extensions - 2017-05-08). Move the cache-tree to o->result before
doing cache_tree_update() to reduce hashing cost.

Since cache_tree_update() has risen up as one of the most expensive
parts in unpack_trees() after the last few patches. This does help
reduce unpack_trees() time significantly (on webkit.git):

    before       after
  --------------------------------------------------------------------
    0.080394752  0.051258167 s:  read cache .git/index
    0.216010838  0.212106298 s:  preload index
    0.008534301  0.280521764 s:  refresh index
    0.251992198  0.218160442 s:   traverse_trees
    0.377031383  0.374948191 s:   check_updates
    0.372768105  0.037040114 s:   cache_tree_update
    1.045887251  0.672031609 s:  unpack_trees
    0.314983512  0.317456290 s:  write index, changed mask = 2e
    0.062572653  0.038382654 s:    traverse_trees
    0.000022544  0.000042731 s:    check_updates
    0.073795585  0.050930053 s:   unpack_trees
    0.073807557  0.051099735 s:  diff-index
    1.938191592  1.614241153 s: git command: git checkout -

[1] I'm pretty sure the reason is an oversight in 34110cd4e3 (Make
    'unpack_trees()' have a separate source and destination index -
    2008-03-06). That patch aims to _not_ update the source index at
    all. The invalidation should have been done on o->result in that
    patch. But then there was no cache-tree on o->result even then so
    it's pointless to do so.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx>
---
 read-cache.c   | 2 ++
 unpack-trees.c | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/read-cache.c b/read-cache.c
index 4fd35f4f37..2b5646ef26 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -2763,4 +2763,6 @@ void move_index_extensions(struct index_state *dst, struct index_state *src)
 {
 	dst->untracked = src->untracked;
 	src->untracked = NULL;
+	dst->cache_tree = src->cache_tree;
+	src->cache_tree = NULL;
 }
diff --git a/unpack-trees.c b/unpack-trees.c
index 6deb04c163..d822662c75 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1570,6 +1570,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 
 	ret = check_updates(o) ? (-2) : 0;
 	if (o->dst_index) {
+		move_index_extensions(&o->result, o->src_index);
 		if (!ret) {
 			if (!o->result.cache_tree)
 				o->result.cache_tree = cache_tree();
@@ -1578,7 +1579,6 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 						  WRITE_TREE_SILENT |
 						  WRITE_TREE_REPAIR);
 		}
-		move_index_extensions(&o->result, o->src_index);
 		discard_index(o->dst_index);
 		*o->dst_index = o->result;
 	} else {
-- 
2.18.0.1004.g6639190530




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux