Re: [PATCH/RFC] archive: allow archiving of reachable sha1

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Nicolas Cornu <nicolac76@xxxxxxxxx> writes:

> Remotely specify a tree-ish by a sha1 is now valid even if
> uploadarchive.allowunreachable is false only if this sha1 is reachable
> from a branch or a tag reference. We consider those last one to be
> public.
>
> Signed-off-by: Nicolas Cornu <nicolac76@xxxxxxxx>
> ---
> Do you think this patch is too much "computationnally expensive"?

Yes.

> Maybe we need an option to disable such a a feature.

No, you need an option to enable it while keeping it disabled by
default.

> +static int is_reachable(const char *refname, const struct object_id *oid, int flags, void *cb_data)
> +{
> +	const unsigned char *sha1 = (unsigned char *)cb_data;
> +	return in_merge_bases(lookup_commit(sha1), lookup_commit(oid->hash));
> +}

So this checks if the tip of a ref (i.e. lookup_commit(oid->hash))
can reach the commit in cb_data, i.e. what the caller had in
oid.hash below.

>  static void parse_treeish_arg(const char **argv,
>  		struct archiver_args *ar_args, const char *prefix,
>  		int remote)
> @@ -364,8 +370,13 @@ static void parse_treeish_arg(const char **argv,
>  		const char *colon = strchrnul(name, ':');
>  		int refnamelen = colon - name;
>  
> -		if (!dwim_ref(name, refnamelen, oid.hash, &ref))
> -			die("no such ref: %.*s", refnamelen, name);
> +		if (!dwim_ref(name, refnamelen, oid.hash, &ref)) {
> +			if (get_sha1(name, oid.hash))
> +				die("Not a valid object name");
> +			if (!for_each_branch_ref(&is_reachable, oid.hash) &&
> +			    !for_each_tag_ref(&is_reachable, oid.hash))
> +				die("no such ref: %.*s", refnamelen, name);
> +		}
>  		free(ref);
>  	}


> +		if (!dwim_ref(name, refnamelen, oid.hash, &ref)) {
> +			if (get_sha1(name, oid.hash))
> +				die("Not a valid object name");
> +			if (!for_each_branch_ref(&is_reachable, oid.hash) &&
> +			    !for_each_tag_ref(&is_reachable, oid.hash))
> +				die("no such ref: %.*s", refnamelen, name);

Isn't this making the check unnecessarily expensive by requiring the
traversal to determine the ancestry down to the same oid.hash to
happen repeatedly?  A repository may have quite a many tags, some
very old ones, and traversing down to these old tags from the
requested commit in order to prove that the commit cannot be reached
by these tags (hence these tags cannot be an excuse for you to
serve the commit) would be expensive.

I suspect that it would be better to collect the objects at the tips
in an array using for_each_*_ref(), and do the check by using a
single call to in_merge_bases_many() instead, which would require
only one traversal?  Granted, it would still dig down to the common
ancestor between the asked commit and these old tags, i.e. one
near-full walk of the history, but at least you won't be making a
multiple such near-full walks of the history.

I also suspect that in_merge_bases_many() could be further optimized
if it turns out to be necessary by teaching paint_down_to_common()
an option to stop early, but that would be a second step of the
optimization.

> +		}
>  		free(ref);
>  	}
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]