Re: [PATCH] fix simple deepening of a repo

"Shawn O. Pearce" <spearce@xxxxxxxxxxx> writes:

> We aren't quite at the 50k ref stage yet, but we're starting to
> consider that some of our repositories have a ton of refs, and
> that the initial advertisement for either fetch or push is horrid.
>
> Since the refs are immutable I could actually teach the JGit
> daemon to hide them from JGit's receive-pack, thus cutting down the
> advertisement on push, but the refs exist so you can literally say:

What do you mean "refs are immutable"?

Do you mean "In the particular application, Gerrit, the server knows that
certain refs will never move nor get deleted, once they are created"?  If
so, then I would understand, but otherwise what you are describing is not
git anymore ;-)

And I think it is probably worth thinking things through to find a way to
take advantage of that knowledge.

Even though refs under refs/changes/ hierarchy may have that property, the
client won't know what's available unless you advertise it in some way.

You could assume some offline measure outside the git protocol exists for
clients to find out about them, and protocol extension could say "I do not
want to hear about refs that match these globs during this exchange,
because I have learnt about them offline", and the server could skip
advertisement.
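
To make the glob idea concrete, here is a minimal sketch of the check the
server would run once it had somehow received the client's "do not tell me
about these" patterns.  No such extension exists; the function name and the
pattern list are hypothetical, and POSIX fnmatch() merely stands in for
whatever matching rule a real extension would define.

```c
#include <fnmatch.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if refname matches any of the globs
 * the client asked not to hear about, 0 otherwise.
 */
static int ref_matches_client_exclusion(const char *refname,
					const char *const *globs,
					size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		/* flags == 0, so '*' is allowed to match '/' as well */
		if (!fnmatch(globs[i], refname, 0))
			return 1;
	}
	return 0;
}
```

With the single pattern "refs/changes/*", refs/changes/88/4488/2 would be
skipped while refs/heads/master would still be advertised.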

>   git fetch --uploadpack='git upload-pack --ref refs/changes/88/4488/2' URL refs/changes/88/4488/2
>
> Personally I'd prefer extending the protocol, because making the
> end user supply information twice is stupid.

In the upload-pack protocol, the server talks first, so it is rather hard
to shoehorn a request from a client to ask "I know about refs/changes/*
hierarchy, so don't talk about them".

I however think it is entirely reasonable to have a server side
configuration that tells upload-pack not to advertise refs/changes/*
hierarchy but still remembers they are OUR_REF.  In send_ref() in
upload-pack.c, you'd do something like (I know, I know, you'd be doing
an equivalent of this in jgit):

	static const char *capabilities = "multi_ack ...";
	struct object *o = parse_object(sha1);
	/* server-side policy: should this ref be left out of the advertisement? */
	int skip_advertisement = exclude_ref_from_advertisement(refname);

	if (!o)
		die("git upload-pack: cannot find object %s:", sha1_to_hex(sha1));

	if (!skip_advertisement) {
		/* capabilities ride after a NUL on the first advertised ref only */
		if (capabilities)
			packet_write(1, "%s %s%c%s\n", sha1_to_hex(sha1), refname,
				0, capabilities);
		else
			packet_write(1, "%s %s\n", sha1_to_hex(sha1), refname);
		capabilities = NULL;
	}

	/* mark the object as OUR_REF whether or not it was advertised */
	if (!(o->flags & OUR_REF)) {
		o->flags |= OUR_REF;
		nr_our_refs++;
	}
	if (o->type == OBJ_TAG) {
		/* the peeled value of a tag is advertised only for visible refs */
		o = deref_tag(o, refname, 0);
		if (o && !skip_advertisement)
			packet_write(1, "%s %s^{}\n", sha1_to_hex(o->sha1), refname);
	}
	return 0;

Doing it this way, receive_needs() will allow refs/changes/88/4488/2 to be
requested, because that is what send_ref() saw and marked as OUR_REF.  It
was just not sent to the client.  And get_common_commits() will behave the
same with or without this abbreviated advertisement.
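
As a toy model of that separation between "marked" and "advertised", here
is a self-contained sketch.  Everything in it is made up for illustration:
the struct, the toy_* functions, and the prefix-based policy are assumptions,
and the lookup is by ref name for simplicity, whereas the real check is done
on the object named by the "want" line.

```c
#include <stddef.h>
#include <string.h>

#define OUR_REF 1u

/* Illustrative stand-in for a ref and the flags on its object. */
struct toy_ref {
	const char *name;
	unsigned flags;
	int advertised;
};

/* Hypothetical policy: hide everything under refs/changes/. */
static int exclude_ref_from_advertisement(const char *refname)
{
	static const char hidden[] = "refs/changes/";

	return !strncmp(refname, hidden, sizeof(hidden) - 1);
}

/*
 * Same shape as the send_ref() change above: every ref is marked
 * OUR_REF, but only non-hidden refs are advertised.  Returns the
 * number of refs that would be sent to the client.
 */
static int toy_advertise(struct toy_ref *refs, size_t nr)
{
	size_t i;
	int sent = 0;

	for (i = 0; i < nr; i++) {
		refs[i].flags |= OUR_REF;
		refs[i].advertised =
			!exclude_ref_from_advertisement(refs[i].name);
		sent += refs[i].advertised;
	}
	return sent;
}

/*
 * Stand-in for validating a "want": allowed if and only if the ref
 * was marked OUR_REF, regardless of whether it was advertised.
 */
static int toy_want_allowed(const struct toy_ref *refs, size_t nr,
			    const char *name)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!strcmp(refs[i].name, name))
			return (refs[i].flags & OUR_REF) != 0;
	}
	return 0;
}
```

Given refs/heads/master and refs/changes/88/4488/2, toy_advertise() sends
only the former, yet toy_want_allowed() still accepts a request for the
latter, because the OUR_REF mark does not depend on the advertisement.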

Of course, the client side cannot grab everything with refs/*:refs/remotes/*
wildcard refspecs from such a server, but I think that can be considered a
feature.