Re: [PATCH 8/8] cache-tree: avoid path comparison loop when silent

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 30.12.20 um 20:26 schrieb Derrick Stolee via GitGitGadget:
> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>
> The verify_cache() method is used both for debugging issues with the
> cached tree extension but also to avoid using the extension when there
> are unmerged entries. It also checks for cache entries out of order with
> respect to file-versus-directory sorting.
>
> In 996277c (Refactor cache_tree_update idiom from commit, 2011-12-06),
> the silent option was added to remove the progress indicators from the
> initial loop looking for unmerged entries. This was changed to be a flag
> in e859c69 (cache-tree: update API to take abitrary flags, 2012-01-16).
>
> In both cases, the silent option is ignored for the second loop that
> checks for file-versus-directory sorting. It must be that this loop is
> intended only for debugging purposes and is not actually helpful in
> practice.

So you're saying that the directory/file conflict check not honoring the
WRITE_TREE_SILENT flag would have been noticed as a bug and therefore
doesn't happen?

I'm not sure I can follow that logic.  I don't know how important that
check is, how often it found a conflict and how likely such a find is
overlooked or ignored, but disabling a check in silent mode that
affects the return code instead of only suppressing its messages seems
risky.

If we are sure that the check cannot trigger then we should remove it.
If we are not so sure, but a conflict would be Git's fault (and not the
user's) then we should always do the check and BUG out.  And otherwise
we should keep it.

> Since verify_cache() is called in cache_tree_update(), which is run
> during 'git commit', we could speed up 'git commit' operations by not
> iterating through this loop another time. Of course, noticing this loop
> requires an incredibly large index.
>
> To get a measurable difference, I needed to run this test on the
> Microsoft Office monorepo, which has over three million entries in its
> index. I used 'git commit --amend --no-edit' as my command and saw the
> performance go from 752ms to 739ms with this change.

Could you please check the performance impact of the following code
simplification?

diff --git a/cache-tree.c b/cache-tree.c
index a537a806c1..1105cfe6b7 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -187,10 +187,8 @@ static int verify_cache(struct cache_entry **cache,
 		 */
 		const char *this_name = cache[i]->name;
 		const char *next_name = cache[i+1]->name;
-		int this_len = strlen(this_name);
-		if (this_len < strlen(next_name) &&
-		    strncmp(this_name, next_name, this_len) == 0 &&
-		    next_name[this_len] == '/') {
+		const char *p;
+		if (skip_prefix(next_name, this_name, &p) && *p == '/') {
 			if (10 < ++funny) {
 				fprintf(stderr, "...\n");
 				break;




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux