Re: [PATCH] Do not fetch tags on new shallow clones

2012/1/4 Nguyễn Thái Ngọc Duy <pclouds@xxxxxxxxx>:
> The main purpose of shallow clones is to reduce download. Fetching
> tags likely defeats this purpose because old-enough repos tend to have
> a lot of tags, spreading across history, which may increase the number
> of objects to download significantly.

Thank you for looking at this. I complained about it to Junio many
weeks ago, but never took the time myself to fix it. :-)

>  We should also fetch a single branch, but because branches are
>  usually less crowded and stay close the tip, they do not produce too
>  many extra objects. Let's leave it until somebody yells up.

Depends on the project. In git.git, maint stays relatively close to
master, but it's still not really that close. In other projects there
can be huge differences between two active branches, sometimes
spanning years. Consider any product with a multi-year support
contract on an older version, where the contract demands bug-fix
patches for that older version.  :-)

I agree this can be looked at later with a different change, but there
should be a way to specify exactly which branches you want to clone,
especially in the shallow case.

>  We should also fetch tags that reference to downloaded objects. But I
>  don't know how fetch does that magic,

If the remote advertises the capability "include-tag", and the client
wants tags, the client asks for include-tag in its request. On the
client side this is driven by the fetch_pack args field include_tag
being set to 1. When the remote side sees the client requesting
include-tag and it packs the object a tag points at, the tag is also
packed, even though the client never explicitly requested it.
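To make the server-side rule concrete, here is a toy sketch of it. This
is not the real upload-pack code: struct toy_tag, toy_is_packed() and
toy_apply_include_tag() are names I made up for illustration, and
objects are plain strings rather than SHA-1s. It only shows the
decision "if the tag's target is in the pack and the client asked for
include-tag, add the tag too".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an annotated tag: its own id and its target's id. */
struct toy_tag {
	const char *tag_id;
	const char *target_id;
};

/* Linear search: is `id` already among the objects chosen for the pack? */
static int toy_is_packed(const char **packed, size_t n, const char *id)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(packed[i], id) == 0)
			return 1;
	return 0;
}

/*
 * If the client requested include-tag, append every tag whose target is
 * already in the pack. Returns the new number of packed objects.
 * Assumption: a single pass is enough for this toy; tags pointing at
 * other tags are only picked up if they appear later in `tags`.
 */
size_t toy_apply_include_tag(const char **packed, size_t n, size_t cap,
			     const struct toy_tag *tags, size_t ntags,
			     int include_tag)
{
	assert(packed != NULL && n <= cap);

	if (!include_tag)
		return n;	/* client did not ask; send nothing extra */

	for (size_t i = 0; i < ntags; i++) {
		if (n < cap &&
		    toy_is_packed(packed, n, tags[i].target_id) &&
		    !toy_is_packed(packed, n, tags[i].tag_id))
			packed[n++] = tags[i].tag_id;
	}
	return n;
}
```

With include_tag set to 0 the pack is left alone; with it set to 1, a
tag rides along only when its target was going to be sent anyway.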

> so for now users have to do
>  "git fetch" after cloning for tags. I have only gone as far as
>  fetching tags along by setting TRANS_OPT_FOLLOWTAGS? Help?

Right. Set TRANS_OPT_FOLLOWTAGS in the transport structure to fetch
only tags that point at objects already being sent. The increase in
transfer is one object per such tag (the tag itself) and whatever that
tag takes up on disk.

> diff --git a/builtin/clone.c b/builtin/clone.c
> index 86db954..abd8578 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -428,7 +428,7 @@ static struct ref *wanted_peer_refs(const struct ref *refs,
>        struct ref **tail = head ? &head->next : &local_refs;
>
>        get_fetch_map(refs, refspec, &tail, 0);
> -       if (!option_mirror)
> +       if (!option_mirror && !option_depth)
>                get_fetch_map(refs, tag_refspec, &tail, 0);
>
>        return local_refs;

I think if you just add this to your patch, you get the auto-follow-tags
feature enabled:

diff --git a/builtin/clone.c b/builtin/clone.c
index efe8b6c..ecaafdb 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -641,6 +641,7 @@ int cmd_clone(int argc, const char **argv, const char *prefi
                        die(_("Don't know how to clone %s"), transport->url);

                transport_set_option(transport, TRANS_OPT_KEEP, "yes");
+               transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1");

                if (option_depth)
                        transport_set_option(transport, TRANS_OPT_DEPTH,

Totally untested (I didn't even compile it). This only works for the
"remote" cases where the native Git protocol is used. A local clone
using the hardlink or copy-objects path, or a dumb HTTP or rsync clone,
will ignore the option and not give you the tags.

Annnddddd..... it doesn't appear to work.

You need to copy a block of code from fetch. The problem is that the
tag object was copied locally by the transport, but the transport
doesn't tell you which extra objects came along. Clone has to loop back
through the advertised reference map from the transport, checking each
tag to see if has_sha1_file() says the object exists locally. If it
does, then clone needs to add that reference update to the set of
things it will store (and print to the terminal).

I think this loop is find_non_local_tags() in builtin/fetch.c. It's
been a long time since I hacked on this code. The JGit version of
looking for these extra objects post-transfer is more clearly
documented. *sigh*
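
Roughly, the post-transfer loop I am describing has this shape. It is a
standalone toy, not the code from builtin/fetch.c: struct toy_ref,
toy_has_object() (standing in for has_sha1_file()) and
toy_find_local_tags() are hypothetical names, object ids are plain
strings, and the real function handles more cases than this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical advertised ref: its name and the object id it points at. */
struct toy_ref {
	const char *name;
	const char *object_id;
};

/* Stand-in for has_sha1_file(): look the id up in an in-memory list. */
static int toy_has_object(const char **local, size_t nlocal, const char *id)
{
	for (size_t i = 0; i < nlocal; i++)
		if (strcmp(local[i], id) == 0)
			return 1;
	return 0;
}

/*
 * Walk the advertised refs after the transfer. Every refs/tags/ entry
 * whose object already exists locally is recorded in `out`, so the
 * caller can store it as a ref update. Returns how many were recorded.
 */
size_t toy_find_local_tags(const struct toy_ref *adv, size_t nadv,
			   const char **local, size_t nlocal,
			   const struct toy_ref **out, size_t cap)
{
	size_t found = 0;

	assert(out != NULL);

	for (size_t i = 0; i < nadv && found < cap; i++) {
		/* Only tags are of interest; branches were already mapped. */
		if (strncmp(adv[i].name, "refs/tags/", 10) != 0)
			continue;
		if (toy_has_object(local, nlocal, adv[i].object_id))
			out[found++] = &adv[i];
	}
	return found;
}
```

The point is only that clone has to discover which tag objects arrived
by asking the object store, since the transport itself won't say.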