Re: [PATCH v1] add: speed up cmd_add() by utilizing read_cache_preload()


On Fri, Nov 2, 2018 at 2:32 PM Ben Peart <peartben@xxxxxxxxx> wrote:
>
> From: Ben Peart <benpeart@xxxxxxxxxxxxx>
>
> During an "add", a call is made to run_diff_files() which calls
> check_remove() for each index-entry.  The preload_index() code distributes
> some of the costs across multiple threads.

Instead of doing this call site by call site, how about we make
read_cache() always do the multithreaded preload?

The only downside I see is that preload may actually hurt performance
when there are too few cache entries (but more than 500), though this
needs to be verified. If the penalty is small enough, I think we could
live with it, since everything is fast when you have few entries anyway.

If the penalty turns out not to be small, we could add a threshold for
activating preload. Something like "if you have 50k files or more, then
activate preload" would do. I notice THREAD_COST in the preload code,
but I don't think it serves the same purpose.
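
As a rough, untested sketch of that threshold idea (standalone C, not
actual git code): PRELOAD_MIN_ENTRIES and preload_thread_count() are
hypothetical names, the 50000 cutoff is just the figure suggested above,
and the two SKETCH_* constants are assumed to mirror THREAD_COST and the
thread cap in the preload code. The point is that the threshold gates
whether preload runs at all, while the per-thread cost only sizes the
thread pool once it does.

```c
#include <stddef.h>

/* Hypothetical cutoff from the "50k files or more" idea; not a git constant. */
#define PRELOAD_MIN_ENTRIES 50000

/* Assumed values, meant to mirror THREAD_COST and the thread cap in the preload code. */
#define SKETCH_THREAD_COST 500
#define SKETCH_MAX_PARALLEL 20

/*
 * Return how many threads a preload would use for an index with
 * nr_entries entries, or 0 when the index is below the activation
 * threshold and preload should be skipped entirely.
 */
static int preload_thread_count(size_t nr_entries)
{
	size_t threads;

	/* Gate: small indexes are read without any preload threads. */
	if (nr_entries < PRELOAD_MIN_ENTRIES)
		return 0;

	/* Sizing: one thread per SKETCH_THREAD_COST entries, capped. */
	threads = nr_entries / SKETCH_THREAD_COST;
	if (threads > SKETCH_MAX_PARALLEL)
		threads = SKETCH_MAX_PARALLEL;

	return (int)threads;
}
```

With these assumed numbers, an index of 600 entries would skip preload,
while the ~200K-file repo from the patch description would use the
capped thread count.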

>
> Because the files checked are restricted to pathspec, adding individual
> files makes no measurable impact but on a Windows repo with ~200K files,
> 'git add .' drops from 6.3 seconds to 3.3 seconds for a 47% savings.
>
> Signed-off-by: Ben Peart <benpeart@xxxxxxxxxxxxx>
> ---
>
> Notes:
>     Base Ref: master
>     Web-Diff: https://github.com/benpeart/git/commit/fc4830b545
>     Checkout: git fetch https://github.com/benpeart/git add-preload-index-v1 && git checkout fc4830b545
>
>  builtin/add.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/builtin/add.c b/builtin/add.c
> index ad49806ebf..f65c172299 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -445,11 +445,6 @@ int cmd_add(int argc, const char **argv, const char *prefix)
>                 return 0;
>         }
>
> -       if (read_cache() < 0)
> -               die(_("index file corrupt"));
> -
> -       die_in_unpopulated_submodule(&the_index, prefix);
> -
>         /*
>          * Check the "pathspec '%s' did not match any files" block
>          * below before enabling new magic.
> @@ -459,6 +454,10 @@ int cmd_add(int argc, const char **argv, const char *prefix)
>                        PATHSPEC_SYMLINK_LEADING_PATH,
>                        prefix, argv);
>
> +       if (read_cache_preload(&pathspec) < 0)
> +               die(_("index file corrupt"));
> +
> +       die_in_unpopulated_submodule(&the_index, prefix);
>         die_path_inside_submodule(&the_index, &pathspec);
>
>         if (add_new_files) {
>
> base-commit: 4ede3d42dfb57f9a41ac96a1f216c62eb7566cc2
> --
> 2.18.0.windows.1
>


-- 
Duy


