Re: [PATCH 0/5] Bundle URIs II: git clone --bundle-uri

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2022.07.25 20:34, Derrick Stolee via GitGitGadget wrote:
> This is the second series building the bundle URI feature as discussed in
> the previous series that added the design document [1]. This series does not
> modify the design document, so the patches are independent and can be
> applied to the latest 'master'.
> 
> [1]
> https://lore.kernel.org/git/pull.1248.v3.git.1658757188.gitgitgadget@xxxxxxxxx
> 
> This series brings in just enough logic that we can bootstrap clones from a
> single bundle using git clone --bundle-uri=<X>.
> 
>  * Patch 1 adds a 'get' capability to 'git remote-https' which allows
>    downloading the contents of a URI to a local file.
>  * Patch 2 creates basic file-copy logic within a new bundle-uri.c file. It
>    is not used until patch 3.
>  * Patch 3 creates the git clone --bundle-uri=<X> option, allowing Git to
>    bootstrap a clone from a bundle, but get the remaining objects from the
>    origin URL. (As of this patch, it only accepts a filename.)
>  * Patch 4 extends the git clone --bundle-uri=<X> option to allow file://
>    and https:// URIs.
>  * Patch 5 is a CLI helper to avoid using --bundle-uri and --depth at the
>    same time in git clone.
> 
> As outlined in [1], the next steps after this are:
> 
>  1. Allow parsing a bundle list as a config file at the given URI. The
>     key-value format is unified with the protocol v2 verb (coming in (3)).
>     [2]
>  2. Implement the protocol v2 verb, re-using the bundle list logic from (2).
>     Use this to auto-discover bundle URIs during 'git clone' (behind a
>     config option). [3]
>  3. Implement the 'creationToken' heuristic, allowing incremental 'git
>     fetch' commands to download a bundle list from a configured URI, and
>     only download bundles that are new based on the creation token values.
>     [4]
> 
> I have prepared some of this work as pull requests on my personal fork so
> curious readers can look ahead to where we are going:
> 
> [2] https://github.com/derrickstolee/git/pull/20
> 
> [3] https://github.com/derrickstolee/git/pull/21
> 
> [4] https://github.com/derrickstolee/git/pull/22
> 
> Note: this series includes some code pulled out of the first series [1], and
> in particular the git fetch --bundle-uri=<X> option is removed. The
> intention was to replace that with git bundle fetch <X>, but that conflicts
> with the work to refactor how subcommands are parsed. The git bundle fetch
> subcommand could be added later for maximum flexibility on the client side,
> but we can also move forward without it.
> 
> Thanks, -Stolee
> 
> P.S. Here is the range-diff compared to v2 of the previous bundle URI
> series:
> 
> 1:  d444042dc4dcc < -:  ------------- docs: document bundle URI standard
> 2:  0a2cf60437f78 ! 1:  40808e92afb7b remote-curl: add 'get' capability
>     @@ remote-curl.c: static void parse_fetch(struct strbuf *buf)
>       
>      +static void parse_get(struct strbuf *buf)
>      +{
>     -+	struct http_get_options opts = { 0 };
>      +	struct strbuf url = STRBUF_INIT;
>      +	struct strbuf path = STRBUF_INIT;
>      +	const char *p, *space;
>     @@ remote-curl.c: static void parse_fetch(struct strbuf *buf)
>      +	strbuf_add(&url, p, space - p);
>      +	strbuf_addstr(&path, space + 1);
>      +
>     -+	if (http_get_file(url.buf, path.buf, &opts))
>     ++	if (http_get_file(url.buf, path.buf, NULL))
>      +		die(_("failed to download file at URL '%s'"), url.buf);
>      +
>      +	strbuf_release(&url);
>     @@ t/t5557-http-get.sh (new)
>      +	get $url file1
>      +	EOF
>      +
>     -+	test_must_fail git remote-http $url $url <input 2>err &&
>     ++	test_must_fail git remote-http $url <input 2>err &&
>      +	test_path_is_missing file1 &&
>      +	grep "failed to download file at URL" err &&
>      +	rm file1.temp
>     @@ t/t5557-http-get.sh (new)
>      +
>      +	EOF
>      +
>     -+	GIT_TRACE2_PERF=1 git remote-http $url $url <input &&
>     ++	GIT_TRACE2_PERF=1 git remote-http $url <input &&
>      +	test_cmp "$HTTPD_DOCUMENT_ROOT_PATH/exists.txt" file2
>      +'
>      +
>      +test_done
>     -
>     - ## transport-helper.c ##
>     -@@ transport-helper.c: struct helper_data {
>     - 		check_connectivity : 1,
>     - 		no_disconnect_req : 1,
>     - 		no_private_update : 1,
>     --		object_format : 1;
>     -+		object_format : 1,
>     -+		get : 1;
>     - 
>     - 	/*
>     - 	 * As an optimization, the transport code may invoke fetch before
>     -@@ transport-helper.c: static struct child_process *get_helper(struct transport *transport)
>     - 			data->no_private_update = 1;
>     - 		} else if (starts_with(capname, "object-format")) {
>     - 			data->object_format = 1;
>     -+		} else if (!strcmp(capname, "get")) {
>     -+			data->get = 1;
>     - 		} else if (mandatory) {
>     - 			die(_("unknown mandatory capability %s; this remote "
>     - 			      "helper probably needs newer version of Git"),
> 3:  abec47564fd9c ! 2:  7d3159f0d9a29 bundle-uri: create basic file-copy logic
>     @@ Commit message
>          file, not a bundle list. Bundle lists will be implemented in a future
>          change.
>      
>     +    Note that the discovery of a temporary filename is slightly racy because
>     +    the odb_mkstemp() relies on the temporary file not existing. With the
>     +    current implementation being limited to file copies, we could replace
>     +    the copy_file() with copy_fd(). The tricky part comes in future changes
>     +    that send the filename to 'git remote-https' and its 'get' capability.
>     +    At that point, we need the file descriptor closed _and_ the file
>     +    unlinked. If we were to keep the file descriptor open for the sake of
>     +    normal file copies, then we would pollute the rest of the code for
>     +    little benefit. This is especially the case because we expect that most
>     +    bundle URI use will be based on HTTPS instead of file copies.
>     +
>          Signed-off-by: Derrick Stolee <derrickstolee@xxxxxxxxxx>
>      
>       ## Makefile ##
>     @@ bundle-uri.c (new)
>      +#include "refs.h"
>      +#include "run-command.h"
>      +
>     -+static void find_temp_filename(struct strbuf *name)
>     ++static int find_temp_filename(struct strbuf *name)
>      +{
>      +	int fd;
>      +	/*
>     @@ bundle-uri.c (new)
>      +	 * racy, but unlikely to collide.
>      +	 */
>      +	fd = odb_mkstemp(name, "bundles/tmp_uri_XXXXXX");
>     -+	if (fd < 0)
>     -+		die(_("failed to create temporary file"));
>     ++	if (fd < 0) {
>     ++		warning(_("failed to create temporary file"));
>     ++		return -1;
>     ++	}
>     ++
>      +	close(fd);
>      +	unlink(name->buf);
>     ++	return 0;
>      +}
>      +
>     -+static int copy_uri_to_file(const char *uri, const char *file)
>     ++static int copy_uri_to_file(const char *file, const char *uri)
>      +{
>     -+	/* Copy as a file */
>     -+	return copy_file(file, uri, 0444);
>     ++	/* File-based URIs only for now. */
>     ++	return copy_file(file, uri, 0);
>      +}
>      +
>      +static int unbundle_from_file(struct repository *r, const char *file)
>     @@ bundle-uri.c (new)
>      +	int result = 0;
>      +	int bundle_fd;
>      +	struct bundle_header header = BUNDLE_HEADER_INIT;
>     -+	struct strvec extra_index_pack_args = STRVEC_INIT;
>      +	struct string_list_item *refname;
>      +	struct strbuf bundle_ref = STRBUF_INIT;
>      +	size_t bundle_prefix_len;
>     @@ bundle-uri.c (new)
>      +	if ((bundle_fd = read_bundle_header(file, &header)) < 0)
>      +		return 1;
>      +
>     -+	if ((result = unbundle(r, &header, bundle_fd, &extra_index_pack_args)))
>     ++	if ((result = unbundle(r, &header, bundle_fd, NULL)))
>      +		return 1;
>      +
>      +	/*
>     @@ bundle-uri.c (new)
>      +	int result = 0;
>      +	struct strbuf filename = STRBUF_INIT;
>      +
>     -+	find_temp_filename(&filename);
>     -+	if ((result = copy_uri_to_file(uri, filename.buf)))
>     ++	if ((result = find_temp_filename(&filename)))
>     ++		goto cleanup;
>     ++
>     ++	if ((result = copy_uri_to_file(filename.buf, uri))) {
>     ++		warning(_("failed to download bundle from URI '%s'"), uri);
>      +		goto cleanup;
>     ++	}
>      +
>     -+	if ((result = !is_bundle(filename.buf, 0)))
>     ++	if ((result = !is_bundle(filename.buf, 0))) {
>     ++		warning(_("file at URI '%s' is not a bundle"), uri);
>      +		goto cleanup;
>     ++	}
>      +
>     -+	if ((result = unbundle_from_file(r, filename.buf)))
>     ++	if ((result = unbundle_from_file(r, filename.buf))) {
>     ++		warning(_("failed to unbundle bundle from URI '%s'"), uri);
>      +		goto cleanup;
>     ++	}
>      +
>      +cleanup:
>      +	unlink(filename.buf);
> 4:  f6255ec518857 < -:  ------------- fetch: add --bundle-uri option
> -:  ------------- > 3:  29e645a54ba7f clone: add --bundle-uri option
> 5:  bfbd11b48bf1b ! 4:  f6bc3177332e8 bundle-uri: add support for http(s):// and file://
>     @@ Metadata
>       ## Commit message ##
>          bundle-uri: add support for http(s):// and file://
>      
>     -    The previous change created the 'git fetch --bundle-uri=<uri>' option.
>     +    The previous change created the 'git clone --bundle-uri=<uri>' option.
>          Currently, <uri> must be a filename.
>      
>          Update copy_uri_to_file() to first inspect the URI for an HTTP(S) prefix
>     @@ Commit message
>          Signed-off-by: Derrick Stolee <derrickstolee@xxxxxxxxxx>
>      
>       ## bundle-uri.c ##
>     -@@ bundle-uri.c: static void find_temp_filename(struct strbuf *name)
>     - 	unlink(name->buf);
>     +@@ bundle-uri.c: static int find_temp_filename(struct strbuf *name)
>     + 	return 0;
>       }
>       
>     -+static int download_https_uri_to_file(const char *uri, const char *file)
>     -+{
>     +-static int copy_uri_to_file(const char *file, const char *uri)
>     ++static int download_https_uri_to_file(const char *file, const char *uri)
>     + {
>     +-	/* File-based URIs only for now. */
>     +-	return copy_file(file, uri, 0);
>      +	int result = 0;
>      +	struct child_process cp = CHILD_PROCESS_INIT;
>      +	FILE *child_in = NULL, *child_out = NULL;
>     @@ bundle-uri.c: static void find_temp_filename(struct strbuf *name)
>      +	return result;
>      +}
>      +
>     - static int copy_uri_to_file(const char *uri, const char *file)
>     - {
>     ++static int copy_uri_to_file(const char *filename, const char *uri)
>     ++{
>      +	const char *out;
>     ++
>      +	if (skip_prefix(uri, "https:", &out) ||
>      +	    skip_prefix(uri, "http:", &out))
>     -+		return download_https_uri_to_file(uri, file);
>     ++		return download_https_uri_to_file(filename, uri);
>      +
>      +	if (!skip_prefix(uri, "file://", &out))
>      +		out = uri;
>      +
>     - 	/* Copy as a file */
>     --	return copy_file(file, uri, 0444);
>     -+	return !!copy_file(file, out, 0);
>     ++	/* Copy as a file */
>     ++	return copy_file(filename, out, 0);
>       }
>       
>       static int unbundle_from_file(struct repository *r, const char *file)
>      
>     - ## t/t5558-fetch-bundle-uri.sh ##
>     -@@ t/t5558-fetch-bundle-uri.sh: test_expect_success 'fetch file bundle' '
>     + ## t/t5558-clone-bundle-uri.sh ##
>     +@@ t/t5558-clone-bundle-uri.sh: test_expect_success 'clone with path bundle' '
>           test_cmp expect actual
>       '
>       
>     -+test_expect_success 'fetch file:// bundle' '
>     -+	git init fetch-file &&
>     -+	git -C fetch-file fetch --bundle-uri="file://$(pwd)/fetch-from/B.bundle" &&
>     -+	git -C fetch-file rev-parse refs/bundles/topic >actual &&
>     -+	git -C fetch-from rev-parse topic >expect &&
>     ++test_expect_success 'clone with file:// bundle' '
>     ++	git clone --bundle-uri="file://$(pwd)/clone-from/B.bundle" \
>     ++		clone-from clone-file &&
>     ++	git -C clone-file rev-parse refs/bundles/topic >actual &&
>     ++	git -C clone-from rev-parse topic >expect &&
>      +	test_cmp expect actual
>      +'
>      +
>     @@ t/t5558-fetch-bundle-uri.sh: test_expect_success 'fetch file bundle' '
>      +start_httpd
>      +
>      +test_expect_success 'fail to fetch from non-existent HTTP URL' '
>     -+	test_must_fail git fetch --bundle-uri="$HTTPD_URL/does-not-exist" 2>err &&
>     ++	test_when_finished rm -rf test &&
>     ++	git clone --bundle-uri="$HTTPD_URL/does-not-exist" . test 2>err &&
>      +	grep "failed to download bundle from URI" err
>      +'
>      +
>      +test_expect_success 'fail to fetch from non-bundle HTTP URL' '
>     ++	test_when_finished rm -rf test &&
>      +	echo bogus >"$HTTPD_DOCUMENT_ROOT_PATH/bogus" &&
>     -+	test_must_fail git fetch --bundle-uri="$HTTPD_URL/bogus" 2>err &&
>     ++	git clone --bundle-uri="$HTTPD_URL/bogus" . test 2>err &&
>      +	grep "is not a bundle" err
>      +'
>      +
>     -+test_expect_success 'fetch HTTP bundle' '
>     -+	cp fetch-from/B.bundle "$HTTPD_DOCUMENT_ROOT_PATH/B.bundle" &&
>     -+	git init fetch-http &&
>     -+	git -C fetch-http fetch --bundle-uri="$HTTPD_URL/B.bundle" &&
>     -+	git -C fetch-http rev-parse refs/bundles/topic >actual &&
>     -+	git -C fetch-from rev-parse topic >expect &&
>     -+	test_cmp expect actual
>     ++test_expect_success 'clone HTTP bundle' '
>     ++	cp clone-from/B.bundle "$HTTPD_DOCUMENT_ROOT_PATH/B.bundle" &&
>     ++
>     ++	git clone --no-local --mirror clone-from \
>     ++		"$HTTPD_DOCUMENT_ROOT_PATH/fetch.git" &&
>     ++
>     ++	git clone --bundle-uri="$HTTPD_URL/B.bundle" \
>     ++		"$HTTPD_URL/smart/fetch.git" clone-http &&
>     ++	git -C clone-http rev-parse refs/bundles/topic >actual &&
>     ++	git -C clone-from rev-parse topic >expect &&
>     ++	test_cmp expect actual &&
>     ++
>     ++	test_config -C clone-http log.excludedecoration refs/bundle/
>      +'
>      +
>      +# Do not add tests here unless they use the HTTP server, as they will
> 6:  a217e9a0640b4 < -:  ------------- fetch: add 'refs/bundle/' to log.excludeDecoration
> -:  ------------- > 5:  e823b168ab725 clone: --bundle-uri cannot be combined with --depth
> 
> 
> Derrick Stolee (5):
>   remote-curl: add 'get' capability
>   bundle-uri: create basic file-copy logic
>   clone: add --bundle-uri option
>   bundle-uri: add support for http(s):// and file://
>   clone: --bundle-uri cannot be combined with --depth
> 
>  Documentation/git-clone.txt         |   7 ++
>  Documentation/gitremote-helpers.txt |   9 ++
>  Makefile                            |   1 +
>  builtin/clone.c                     |  18 +++
>  bundle-uri.c                        | 168 ++++++++++++++++++++++++++++
>  bundle-uri.h                        |  14 +++
>  remote-curl.c                       |  32 ++++++
>  t/t5557-http-get.sh                 |  37 ++++++
>  t/t5558-clone-bundle-uri.sh         |  81 ++++++++++++++
>  t/t5606-clone-options.sh            |   8 ++
>  10 files changed, 375 insertions(+)
>  create mode 100644 bundle-uri.c
>  create mode 100644 bundle-uri.h
>  create mode 100755 t/t5557-http-get.sh
>  create mode 100755 t/t5558-clone-bundle-uri.sh
> 
> 
> base-commit: e72d93e88cb20b06e88e6e7d81bd1dc4effe453f
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1300%2Fderrickstolee%2Fbundle-redo%2Fclone-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1300/derrickstolee/bundle-redo/clone-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1300
> -- 
> gitgitgadget

Looks good to me, with just one small fix and a couple of optional
nitpicks. Thaks for the series!

Reviewed-by: Josh Steadmon <steadmon@xxxxxxxxxx>



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux