On 25/02/19 02:23PM, Patrick Steinhardt wrote: > Most of the commands in git-update-ref(1) accept an old and/or new > object ID to update a specific reference to. These object IDs get parsed > via `repo_get_oid()`, which not only handles plain object IDs, but also > those that have a suffix like "~" or "^2". More surprisingly though, it > even knows to resolve references, despite the fact that its manpage does > not mention this fact even once. > > One consequence of this is that we also check for ambiguous references: > when parsing a full object ID where the DWIM mechanism would also cause > us to resolve it as a branch, we'd end up printing a warning. While this > check makes sense to have in general, it is arguably less useful in the > context of git-update-ref(1). This is out of two reasons: > > - The manpage is explicitly structured around object IDs. So if we see > a fully blown object ID, the intent should be quite clear in > general. Makes sense. > - The command is part of our plumbing layer and not a tool that users > would generally use in interactive workflows. As such, the warning > will likely not be visible to anybody in the first place. Ok, so in many cases already the warning is not propagated which makes its computation wasteful to begin with. > Furthermore, this check can be quite expensive when updating lots of > references via `--stdin`, because we try to read multiple references per > object ID that we parse according to the DWIM rules. This effect can be > seen both with the "files" and "reftable" backend. > > The issue is not unique to git-update-ref(1), but was also an issue in > git-cat-file(1), where it was addressed by disabling the ambiguity check > in 25fba78d36b (cat-file: disable object/refname ambiguity check for > batch mode, 2013-07-12). > > Disable the warning in git-update-ref(1), which provides a significant > speedup with both backends. The following benchmark creates 10000 new > references with a 100000 preexisting refs with the "files" backend: > > Benchmark 1: update-ref: create many refs (refformat = files, preexisting = 100000, new = 10000, revision = HEAD~) > Time (mean ± σ): 467.3 ms ± 5.1 ms [User: 100.0 ms, System: 365.1 ms] > Range (min … max): 461.9 ms … 479.3 ms 10 runs > > Benchmark 2: update-ref: create many refs (refformat = files, preexisting = 100000, new = 10000, revision = HEAD) > Time (mean ± σ): 394.1 ms ± 5.8 ms [User: 63.3 ms, System: 327.6 ms] > Range (min … max): 384.9 ms … 405.7 ms 10 runs > > Summary > update-ref: create many refs (refformat = files, preexisting = 100000, new = 10000, revision = HEAD) ran > 1.19 ± 0.02 times faster than update-ref: create many refs (refformat = files, preexisting = 100000, new = 10000, revision = HEAD~) > > And with the "reftable" backend: > > Benchmark 1: update-ref: create many refs (refformat = reftable, preexisting = 100000, new = 10000, revision = HEAD~) > Time (mean ± σ): 146.9 ms ± 2.2 ms [User: 90.4 ms, System: 56.0 ms] > Range (min … max): 142.7 ms … 150.8 ms 19 runs > > Benchmark 2: update-ref: create many refs (refformat = reftable, preexisting = 100000, new = 10000, revision = HEAD) > Time (mean ± σ): 63.2 ms ± 1.1 ms [User: 41.0 ms, System: 21.8 ms] > Range (min … max): 61.1 ms … 66.6 ms 41 runs > > Summary > update-ref: create many refs (refformat = reftable, preexisting = 100000, new = 10000, revision = HEAD) ran > 2.32 ± 0.05 times faster than update-ref: create many refs (refformat = reftable, preexisting = 100000, new = 10000, revision = HEAD~) > > Note that the absolute improvement with both backends is roughly in the > same ballpark, but the relative improvement for the "reftable" backend > is more significant because writing the new table to disk is faster in > the first place. > > Signed-off-by: Patrick Steinhardt <ps@xxxxxx> > --- > builtin/update-ref.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/builtin/update-ref.c b/builtin/update-ref.c > index 4d35bdc4b4b..d603f54b770 100644 > --- a/builtin/update-ref.c > +++ b/builtin/update-ref.c > @@ -179,7 +179,8 @@ static int parse_next_oid(const char **next, const char *end, > (*next)++; > *next = parse_arg(*next, &arg); > if (arg.len) { > - if (repo_get_oid(the_repository, arg.buf, oid)) > + if (repo_get_oid_with_flags(the_repository, arg.buf, oid, > + GET_OID_SKIP_AMBIGUITY_CHECK)) > goto invalid; > } else { > /* Without -z, an empty value means all zeros: */ > @@ -197,7 +198,8 @@ static int parse_next_oid(const char **next, const char *end, > *next += arg.len; > > if (arg.len) { > - if (repo_get_oid(the_repository, arg.buf, oid)) > + if (repo_get_oid_with_flags(the_repository, arg.buf, oid, > + GET_OID_SKIP_AMBIGUITY_CHECK)) > goto invalid; > } else if (flags & PARSE_SHA1_ALLOW_EMPTY) { > /* With -z, treat an empty value as all zeros: */ > @@ -772,7 +774,8 @@ int cmd_update_ref(int argc, > refname = argv[0]; > value = argv[1]; > oldval = argv[2]; > - if (repo_get_oid(the_repository, value, &oid)) > + if (repo_get_oid_with_flags(the_repository, value, &oid, > + GET_OID_SKIP_AMBIGUITY_CHECK)) > die("%s: not a valid SHA1", value); > } > > @@ -783,7 +786,8 @@ int cmd_update_ref(int argc, > * must not already exist: > */ > oidclr(&oldoid, the_repository->hash_algo); > - else if (repo_get_oid(the_repository, oldval, &oldoid)) > + else if (repo_get_oid_with_flags(the_repository, oldval, &oldoid, > + GET_OID_SKIP_AMBIGUITY_CHECK)) > die("%s: not a valid old SHA1", oldval); > } In builtin/update-ref.c all uses of repo_get_oid() have been converted to repo_get_oid_with_flags() with the GET_OID_SKIP_AMBIGUITY_CHECK flag except for one in parse_cmd_symref_update(). Is there reason to leave that one untouched? -Justin