Re: [PATCH v1] read_index_from(): Skip verification of the cache entry order to speed index loading

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 19, 2017 at 11:12:03AM -0400, Ben Peart wrote:

> > So, I think I like the direction of getting rid of the order
> > validation in post_read_index_from(), not only during the normal
> > operation but also in fsck.  I think it makes more sense to do so
> > incrementally inside do_read_index() all the time and see how fast
> > we can make it do so.
> 
> Unfortunately, all the cost is in the strcmp() - the other tests are
> negligible so moving it to be done incrementally inside do_read_index()
> won't reduce the cost, it just moves it and makes it harder to identify.

It's plausible that doing the strcmp() closer to where we are otherwise
manipulating the data may show an improvement due to memory cache
effects.

It should be easy enough to check that; the patch below implements it.
I couldn't measure any speedup with it running "git ls-files >/dev/null"
on linux.git (60k files). But nor could I get any by dropping the check
entirely.

This is mostly just a curiosity to me. For the record, I have no real
problem with dropping this kind of on-disk data-structure validation
when it's expensive. After all, we do not check the sort on pack .idx
files on each run (nor pack sha1 checksums, etc) because it's too
expensive to do so.

---
diff --git a/read-cache.c b/read-cache.c
index 65f4fe8375..ac8c8d2e93 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1664,25 +1664,19 @@ static struct cache_entry *create_from_disk(struct ondisk_cache_entry *ondisk,
 	return ce;
 }
 
-static void check_ce_order(struct index_state *istate)
-{
-	unsigned int i;
-
-	for (i = 1; i < istate->cache_nr; i++) {
-		struct cache_entry *ce = istate->cache[i - 1];
-		struct cache_entry *next_ce = istate->cache[i];
-		int name_compare = strcmp(ce->name, next_ce->name);
-
-		if (0 < name_compare)
-			die("unordered stage entries in index");
-		if (!name_compare) {
-			if (!ce_stage(ce))
-				die("multiple stage entries for merged file '%s'",
-				    ce->name);
-			if (ce_stage(ce) > ce_stage(next_ce))
-				die("unordered stage entries for '%s'",
-				    ce->name);
-		}
+static void check_ce_order(struct cache_entry *ce, struct cache_entry *next_ce)
+{
+	int name_compare = strcmp(ce->name, next_ce->name);
+
+	if (0 < name_compare)
+		die("unordered stage entries in index");
+	if (!name_compare) {
+		if (!ce_stage(ce))
+			die("multiple stage entries for merged file '%s'",
+			    ce->name);
+		if (ce_stage(ce) > ce_stage(next_ce))
+			die("unordered stage entries for '%s'",
+			    ce->name);
 	}
 }
 
@@ -1720,7 +1714,6 @@ static void tweak_split_index(struct index_state *istate)
 
 static void post_read_index_from(struct index_state *istate)
 {
-	check_ce_order(istate);
 	tweak_untracked_cache(istate);
 	tweak_split_index(istate);
 }
@@ -1784,6 +1777,8 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist)
 
 		disk_ce = (struct ondisk_cache_entry *)((char *)mmap + src_offset);
 		ce = create_from_disk(disk_ce, &consumed, previous_name);
+		if (i > 0)
+			check_ce_order(istate->cache[i - 1], ce);
 		set_index_entry(istate, i, ce);
 
 		src_offset += consumed;



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux