Re: [PATCH 8/8] cache-tree: avoid path comparison loop when silent

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 30, 2020 at 11:27 AM Derrick Stolee via GitGitGadget
<gitgitgadget@xxxxxxxxx> wrote:
>
> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>
> The verify_cache() method is used both for debugging issues with the
> cached tree extension but also to avoid using the extension when there
> are unmerged entries. It also checks for cache entries out of order with
> respect to file-versus-directory sorting.
>
> In 996277c (Refactor cache_tree_update idiom from commit, 2011-12-06),
> the silent option was added to remove the progress indicators from the
> initial loop looking for unmerged entries. This was changed to be a flag
> in e859c69 (cache-tree: update API to take abitrary flags, 2012-01-16).
>
> In both cases, the silent option is ignored for the second loop that
> checks for file-versus-directory sorting. It must be that this loop is
> intended only for debugging purposes and is not actually helpful in
> practice.

Looking through that code, I come to the same conclusion, though it
might be nice to have Junio confirm (and to explain the "if (10 <
++funny)" section; did that help debugging too?).   The second part of
the loop was part of his initial commit adding the cache-tree
extension in commit 749864627c ("Add cache-tree.", 2006-04-23)

> Since verify_cache() is called in cache_tree_update(), which is run
> during 'git commit', we could speed up 'git commit' operations by not
> iterating through this loop another time. Of course, noticing this loop
> requires an incredibly large index.
>
> To get a measurable difference, I needed to run this test on the
> Microsoft Office monorepo, which has over three million entries in its
> index. I used 'git commit --amend --no-edit' as my command and saw the
> performance go from 752ms to 739ms with this change.

Nice catch; I always appreciate when we can speed up a section of code
by just not running it.  :-)

>
> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> ---
>  cache-tree.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/cache-tree.c b/cache-tree.c
> index f135bb77af5..c6c084463bd 100644
> --- a/cache-tree.c
> +++ b/cache-tree.c
> @@ -175,10 +175,15 @@ static int verify_cache(struct cache_entry **cache,
>         if (funny)
>                 return -1;
>
> -       /* Also verify that the cache does not have path and path/file
> +       /*
> +        * Also verify that the cache does not have path and path/file
>          * at the same time.  At this point we know the cache has only
> -        * stage 0 entries.
> +        * stage 0 entries. In silent mode, we do not want these messages,
> +        * and they should not exist unless a bug was introduced along
> +        * the way.
>          */
> +       if (silent)
> +               return 0;
>         funny = 0;
>         for (i = 0; i < entries - 1; i++) {
>                 /* path/file always comes after path because of the way
> --
> gitgitgadget

Change seems simple enough.  Thanks for adding the comment explaining
the early return.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux