Re: [PATCH v2 2/3] receive-pack: skip connectivity checks on delete-only commands

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jun 28, 2021 at 07:33:11AM +0200, Patrick Steinhardt wrote:

> Fix this by not doing a connectivity check in case there were no pushed
> objects. Given that git-rev-walk(1) with only negative references will
> not do any graph walk, no performance improvements are to be expected.
> Conceptionally, it is still the right thing to do though.

Even though it's not producing any exciting results, I agree this is
still a reasonable thing to do.

I'm actually surprised it didn't help in your many-ref cases, just
because I think the traversal machinery is pretty eager to parse tags
and commits which are fed as tips.

If I run "git rev-list --not --all --stdin </dev/null" in linux.git, it
takes about 35ms. But if I make a ton of refs, like:

  git rev-list HEAD |
  perl -lpe 's{.*}{update refs/foo/commit$. $&}' |
  git update-ref --stdin
  git pack-refs --all --prune

then it takes about 2000ms (if you don't pack it's even worse, as you
might expect).

So how come we don't see that improvement in your "extrarefs" cases?
Looking at patch 1, they also seem to make one ref for every commit.

I think the answer may be below...

> +	/*
> +	 * If received commands only consist of deletions, then the client MUST
> +	 * NOT send a packfile because there cannot be any new objects in the
> +	 * first place. As a result, we do not set up a quarantine environment
> +	 * because we know no new objects will be received. And that in turn
> +	 * means that we can skip connectivity checks here.
> +	 */
> +	if (tmp_objdir) {

I think this will work, but we're now making assumptions about how
tmp_objdir will be initialized by the rest of the code.

Could we make a more direct check of: skip calling rev-list if we have
no positive tips to feed? If we call iterate_receive_command() and it
returns end-of-list on the first call, then we know there is nothing to
feed (and as a bonus, this catches some more noop cases around shallow
repos; see iterate_receive_command).

But that made me think that check_connected() could be doing this
itself. And indeed, it seems to already. At the top of that function is:

          if (fn(cb_data, &oid)) {
                  if (opt->err_fd)
                          close(opt->err_fd);
                  return err;
          }

So before we even spawn rev-list, if there's nothing to feed it, we'll
return right away ("err" will be "0" at this point). And I think that
has been there since the beginning of the function (it is even in the
old versions in builtin/fetch.c before it was factored out).

And that explains why you didn't see any performance improvement. We're
already doing this optimization. :)

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux