Re: [PATCH v1] checkout: optionally speed up "git checkout -b foo"

On 7/24/2018 2:42 PM, Eric Sunshine wrote:
On Tue, Jul 24, 2018 at 2:01 PM Ben Peart <Ben.Peart@xxxxxxxxxxxxx> wrote:
If the new core.optimizecheckout config setting is set to true, speed up

Maybe:

     Add core.optimizeCheckout config setting which, when true, speeds up


Sure

"git checkout -b foo" by avoiding the work to merge the working tree.  This
is valid because no merge needs to occur - only creating the new branch/
updating the refs. Any other options force it through the old code path.

This change in behavior is off by default and behind the config setting so
that users have to opt-in to the optimized behavior.

We've been running with this patch internally for a long time but it was
rejected when I submitted it to the mailing list before because it
implicitly changes the behavior of checkout -b. Trying it again configured
behind a config setting as a potential solution for other optimizations to
checkout that could change the behavior as well.

This paragraph is mere commentary which probably belongs below the
"---" line following your sign-off.


Hopefully this commentary (I'll move it below the --- line) is clearer:

We've been running with this patch internally for a long time but it was
rejected when I submitted it to the mailing list before [1] because it
implicitly changes the behavior of checkout -b as it no longer updates
the working directory.

I'm submitting it again behind a config setting so that it doesn't cause
any back compat issues unless the user explicitly opts in to the new
behavior. My hope is that this same setting and model can be used if/when
we make other performance optimizations to checkout, such as the one
being discussed in [2]: using the cache tree to avoid having to traverse
the entire tree.

[1] https://public-inbox.org/git/20160909192520.4812-1-benpeart@xxxxxxxxxxxxx/
[2] https://public-inbox.org/git/20180724042740.GB13248@xxxxxxxxxxxxxxxxxxxxx/T/#m75afe3ab318d23f36334cf3a6e3d058839592469

https://public-inbox.org/git/20180724042740.GB13248@xxxxxxxxxxxxxxxxxxxxx/T/#m75afe3ab318d23f36334cf3a6e3d058839592469

Is this link meant to reference the previous attempt of optimizing
"checkout -b"? Although there's a single mention of "checkout -b" in
that discussion, it doesn't seem to be the previous attempt or explain
why it was rejected.

It would be quite nice to see a discussion in both the commit message
and the documentation about the pros and cons of enabling this
optimization. That it was previously rejected suggests that there may
be serious or unexpected consequences. How will a typical user know
whether its use is desirable or not?

Signed-off-by: Ben Peart <Ben.Peart@xxxxxxxxxxxxx>
---
diff --git a/Documentation/config.txt b/Documentation/config.txt
@@ -911,6 +911,12 @@ core.commitGraph::
+core.optimizedCheckout
+       Speed up "git checkout -b foo" by skipping much of the work of a
+       full checkout command.  This changs the behavior as it will skip

s/changs/changes/

+       merging the trees and updating the index and instead only create
+       and switch to the new ref.
diff --git a/builtin/checkout.c b/builtin/checkout.c
@@ -471,6 +475,88 @@ static void setup_branch_path(struct branch_info *branch)
+static int needs_working_tree_merge(const struct checkout_opts *opts,
+       const struct branch_info *old_branch_info,
+       const struct branch_info *new_branch_info)
+{
+       /*
+        * We must do the merge if we are actually moving to a new
+        * commit tree.
+        */
+       if (!old_branch_info->commit || !new_branch_info->commit ||
+               oidcmp(&old_branch_info->commit->object.oid, &new_branch_info->commit->object.oid))
+               return 1;
+       [...]
+       return 0;
+}

This long list of special-case checks doesn't leave me too enthused.
That aside, this approach seems backward. Rather than erring
on the side of safety by falling back to the merging behavior, it errs
in the other direction, which may be a problem if this list of
special-case checks ever gets out of sync with 'checkout_opts'. That
is, if someone adds a new option which ought to employ the merging
behavior, but forgets to update this function, then this function will
incorrectly default to using the optimization.


I'm not thrilled with the long list either (the plethora of comments
probably makes it appear worse than it is), but I don't see how flipping
the logic around makes it fail if someone adds a new option. The "meets
all criteria for optimization" code can only test existing options.

A safer approach would be the inverse, namely:

     static int skip_worktree_merge(...)
     {
         if (...meets all criteria for optimization...)
             return 1;
         return 0;
     }
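
To make the inverse form above concrete, here is a minimal, self-contained sketch. The struct and field names (demo_opts, demo_branch, commit_id) are hypothetical stand-ins and not the real definitions in builtin/checkout.c, and the list of options checked is illustrative only. The point is just the shape: the function returns 1 only when every listed condition holds and otherwise falls through to "do the merge".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; NOT the structs in builtin/checkout.c. */
struct demo_opts {
	const char *new_branch; /* name given to -b, or NULL */
	int force;              /* -f */
	int merge;              /* -m */
	int force_detach;       /* --detach */
	int patch_mode;         /* -p */
};

struct demo_branch {
	const char *commit_id;  /* object name as a string, or NULL if unborn */
};

/*
 * Return 1 only when every condition for the fast path is met.
 * Anything not explicitly recognized here ends at "return 0",
 * i.e. the regular merging code path.
 */
static int skip_worktree_merge(const struct demo_opts *opts,
			       const struct demo_branch *old_branch,
			       const struct demo_branch *new_branch)
{
	assert(opts && old_branch && new_branch);

	if (opts->new_branch &&
	    !opts->force && !opts->merge &&
	    !opts->force_detach && !opts->patch_mode &&
	    old_branch->commit_id && new_branch->commit_id &&
	    !strcmp(old_branch->commit_id, new_branch->commit_id))
		return 1;

	return 0;
}
```

Note that this sketch, like any such function, can only test the options that exist when it is written; it does not by itself settle whether a newly added option would be caught.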

  static int merge_working_tree(const struct checkout_opts *opts,
                               struct branch_info *old_branch_info,
                               struct branch_info *new_branch_info,
{
+       /*
+        * Skip merging the trees, updating the index, and work tree only if we
+        * are simply creating a new branch via "git checkout -b foo."  Any
+        * other options or usage will continue to do all these steps.
+        */
+       if (core_optimize_checkout && !needs_working_tree_merge(opts, old_branch_info, new_branch_info))
+               return 0;

This seems a somewhat odd place to hook in this optimization,
especially as there is only a single caller of this function. Instead,
one might expect the caller itself to make this judgment and avoid
trying the merge in the first place if not needed. That is, in
switch_branches:

     if (!skip_worktree_merge(...))
         ret = merge_working_tree(...);


I personally agree; it was moved to its current location per feedback
the first time around. Perhaps with the addition of the config setting
it will be better received if moved out to the caller.


