Re: [PATCH v2 7/9] bundle-uri: allow relative URLs in bundle lists

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/28/2022 8:25 PM, Victoria Dye wrote:
> Derrick Stolee via GitGitGadget wrote:

>> +	if (!list->baseURI) {
>> +		struct strbuf baseURI = STRBUF_INIT;
>> +		strbuf_addstr(&baseURI, uri);
>> +
>> +		/*
>> +		 * If the URI does not end with a trailing slash, then
>> +		 * remove the filename portion of the path. This is
>> +		 * important for relative URIs.
>> +		 */
>> +		strbuf_strip_file_from_path(&baseURI);
>> +		list->baseURI = strbuf_detach(&baseURI, NULL);
>
> Is the 'baseURI' is set to the URI of the first bundle (ordered by hash)? If
> data is distributed across multiple CDNs, couldn't this be a suboptimal
> choice? For example, if the first bundle is on 'A.com', but every other
> bundle is on 'B.org'?

The baseURI is set to one of two things:

1. The URI used for the clone, specifying the way the client connected to
   the Git server, or

2. The URI used to download the bundle list itself.

This allows the same bundle list file to be distributed to multiple CDNs,
assuming that the bundles themselves will have the same relative position
to the list.

>> +	/**
>> +	 * The baseURI of a bundle_list is used as the base for any
>> +	 * relative URIs advertised by the bundle list at that location.
>> +	 *
>> +	 * When the list is generated from a Git server, then use that
>> +	 * server's location.
>
> Hmmm, I think I'm missing something with my earlier comment. I thought the
> 'uri' argument to 'bundle_uri_parse_config_format()' was an individual
> bundle's URI? What's the "server's location" in this context?

I can work to make this concept clearer by rewording this comment.

>> @@ -40,6 +40,8 @@ static int cmd__bundle_uri_parse(int argc, const char **argv, enum input_mode mo
>>
>>  	init_bundle_list(&list);
>>
>> +	list.baseURI = xstrdup("<uri>");
>
> Using a hardcoded value here leads to pretty different behavior in
> 'test-bundle-uri.c' vs. starting with an unset 'list.baseURI' in something
> like 'clone'. Why does this need to be set to '<uri>' for the tests?

In this part of the test helper, we are not making a connection to a server
and instead parsing a bundle list file directly. To demonstrate how the
relative paths work during this parsing, we add a bogus baseURI here so
we can clearly see where the relative paths were parsed versus using the
URI as an absolute URI.


>> +test_expect_success 'bundle_uri_parse_line(): relative URIs' '
>> +	cat >in <<-\EOF &&
>> +	bundle.one.uri=bundle.bdl
>> +	bundle.two.uri=../bundle.bdl
>> +	bundle.three.uri=sub/dir/bundle.bdl
>> +	EOF
>> +
>> +	cat >expect <<-\EOF &&
>> +	[bundle]
>> +		version = 1
>> +		mode = all
>> +	[bundle "one"]
>> +		uri = <uri>/bundle.bdl
>> +	[bundle "two"]
>> +		uri = bundle.bdl
>
> This seems a little strange, but it looks like '<uri>/../bundle.bdl'
> normalizes to 'bundle.bdl' because '<uri>' is treated like a regular path
> element (like a directory).
>
> Out of curiosity, what would happen if 'bundle.two.uri' was
> '../../bundle.bdl'?

It will fail! The error message is

	"fatal: cannot strip one component off url '.'"

This is disappointing that an erroneous bundle list could cause a 'git
clone' command to die(), when we want the bundle URI feature to allow the
clone to continue normally even if the bundle downloads fail. I will mark
this for #leftoverbits, since it would involve changing the interface for
chop_last_dir() and relative_url() in remote.c.

At minimum, I will document this with a test case.

Thanks,
-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux