Hi Stolee

On 09/08/2022 14:12, Derrick Stolee via GitGitGadget wrote:
> This is the first of series towards building the bundle URI feature as
> discussed in previous RFCs, specifically pulled directly out of [5]:
>
> [1] https://lore.kernel.org/git/RFC-cover-00.13-0000000000-20210805T150534Z-avarab@xxxxxxxxx/
> [2] https://lore.kernel.org/git/cover-0.3-00000000000-20211025T211159Z-avarab@xxxxxxxxx/
> [3] https://lore.kernel.org/git/pull.1160.git.1645641063.gitgitgadget@xxxxxxxxx
> [4] https://lore.kernel.org/git/RFC-cover-v2-00.36-00000000000-20220418T165545Z-avarab@xxxxxxxxx/
> [5] https://lore.kernel.org/git/pull.1234.git.1653072042.gitgitgadget@xxxxxxxxx
>
> THIS ONLY INCLUDES THE DESIGN DOCUMENT. See "Updates in v3".
>
> There are two patches:
>
>  1. The main design document that details the bundle URI standard and how
>     the client interacts with the bundle data.
>  2. An addendum to the design document that details one strategy for
>     organizing bundles from the perspective of a bundle provider.

I thought the document was well written and left me with a good understanding of both the problem being addressed and the rationale for the solution. One small query - the document mentions CI farms as benefiting from this work, but my impression is that those commonly use shallow clones, which are (quite reasonably) not supported in this proposal.

Best Wishes

Phillip

> As outlined in [5], the next steps after this are:
>
>  1. Add 'git clone --bundle-uri=' to run a 'git bundle fetch ' step before
>     doing a fetch negotiation with the origin remote. [6]
>  2. Allow parsing a bundle list as a config file at the given URI. The
>     key-value format is unified with the protocol v2 verb (coming in (3)).
>     [7]
>  3. Implement the protocol v2 verb, re-using the bundle list logic from (2).
>     Use this to auto-discover bundle URIs during 'git clone' (behind a
>     config option). [8]
>  4. Implement the 'creationToken' heuristic, allowing incremental 'git
>     fetch' commands to download a bundle list from a configured URI, and
>     only download bundles that are new based on the creation token values.
>     [9]
>
> I have prepared some of this work as pull requests on my personal fork so
> curious readers can look ahead to where we are going:
>
> [6] https://github.com/derrickstolee/git/pull/18
> [7] https://github.com/derrickstolee/git/pull/20
> [8] https://github.com/derrickstolee/git/pull/21
> [9] https://github.com/derrickstolee/git/pull/22
>
> As mentioned in the design document, this is not all that is possible. For
> instance, Ævar's suggestion to download only the bundle headers can be used
> as a second heuristic (and as an augmentation of the timestamp heuristic).
>
>
> Updates in v4
> =============
>
>  * Whitespace issue resolved.
>  * Example bundle provider setup now uses the 'bundle-uri' protocol v2
>    format when describing how the origin Git server advertises the static
>    bundle servers.
>
>
> Updates in v3
> =============
>
>  * This version only includes the design document. Thanks to all the
>    reviewers for the significant attention that improves the doc a lot.
>  * The second patch has an addition to the design document that details a
>    potential way to organize bundles from the provider's perspective.
>  * Based on some off-list feedback, I was going to switch git fetch
>    --bundle-uri into git bundle fetch, but that has a major conflict with
>    [10] which was just submitted.
>  * I will move the git bundle fetch implementation into [6] which also has
>    the git clone --bundle-uri implementation.
>
> [10] https://lore.kernel.org/git/20220725123857.2773963-1-szeder.dev@xxxxxxxxx/
>
>
> Updates in v2
> =============
>
>  * The design document has been updated based on Junio's feedback.
>  * The "bundle.list." keys are now just "bundle.".
>  * The "timestamp" heuristic is now "creationToken".
>  * More clarity on how Git parses data from the bundle URI.
>  * Dropped some unnecessary bundle list keys (*.list, *.requires).
>
> Thanks, -Stolee
>
> Derrick Stolee (2):
>   docs: document bundle URI standard
>   bundle-uri: add example bundle organization
>
>  Documentation/Makefile                 |   1 +
>  Documentation/technical/bundle-uri.txt | 573 +++++++++++++++++++++++++
>  2 files changed, 574 insertions(+)
>  create mode 100644 Documentation/technical/bundle-uri.txt
>
>
> base-commit: e72d93e88cb20b06e88e6e7d81bd1dc4effe453f
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1248%2Fderrickstolee%2Fbundle-redo%2Ffetch-v4
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1248/derrickstolee/bundle-redo/fetch-v4
> Pull-Request: https://github.com/gitgitgadget/git/pull/1248
>
> Range-diff vs v3:
>
>  1:  e0f003e1b5f ! 1:  1bfac1f492a docs: document bundle URI standard
>      @@ Documentation/technical/bundle-uri.txt (new)
>       +	work well with incremental `git fetch` commands. The heuristic signals
>       +	that there are additional keys available for each bundle that help
>       +	determine which subset of bundles the client should download. The only
>      -+	heuristic currently planned is `creationToken`.
>      ++	heuristic currently planned is `creationToken`.
>       +
>       +The remaining keys include an `<id>` segment which is a server-designated
>       +name for each available bundle. The `<id>` must contain only alphanumeric
>  2:  a933471c3af ! 2:  a22c24aa85a bundle-uri: add example bundle organization
>      @@ Documentation/technical/bundle-uri.txt: error conditions:
>       +	[bundle]
>       +		version = 1
>       +		mode = any
>      -+
>      ++
>       +	[bundle "eastus"]
>       +		uri = https://eastus.example.com/<domain>/<org>/<repo>
>      -+
>      ++
>       +	[bundle "europe"]
>       +		uri = https://europe.example.com/<domain>/<org>/<repo>
>      -+
>      ++
>       +	[bundle "apac"]
>       +		uri = https://apac.example.com/<domain>/<org>/<repo>
>       +
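
To make the config-style bundle list and the 'creationToken' heuristic from the cover letter a little more concrete, here is a rough Python sketch. It is not taken from the series. It parses a bundle list written in the key-value format shown in the range-diff and then picks the bundles whose token is newer than the last one a client has seen. The key names `heuristic` and `creationToken` inside the list, the `mode = all` value, the bundles.example.com URIs, the function names and the "strictly greater than the stored token" rule are all assumptions made for illustration; the design document remains the authority on the real behaviour.

```python
import re

# Matches section headers such as [bundle] or [bundle "eastus"].
SECTION = re.compile(r'^\[(\w+)(?:\s+"([^"]*)")?\]$')


def parse_bundle_list(text):
    """Parse a bundle list written in git-config style syntax.

    Returns (settings, bundles): `settings` holds the keys of the plain
    [bundle] section, `bundles` maps each <id> to that bundle's keys.
    """
    settings, bundles = {}, {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        header = SECTION.match(line)
        if header:
            name, bundle_id = header.groups()
            if name.lower() != "bundle":
                current = None  # ignore unrelated sections
            elif bundle_id is None:
                current = settings
            else:
                current = bundles.setdefault(bundle_id, {})
            continue
        if current is None or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Config key names are case-insensitive, so normalise them.
        current[key.strip().lower()] = value.strip()
    return settings, bundles


def new_bundles(bundles, last_seen_token):
    """Return (id, uri) pairs whose token exceeds last_seen_token.

    Results are sorted by ascending token; that order is a choice of this
    sketch. Bundles without a token are skipped.
    """
    fresh = [
        (int(keys["creationtoken"]), bundle_id, keys["uri"])
        for bundle_id, keys in bundles.items()
        if "creationtoken" in keys
        and int(keys["creationtoken"]) > last_seen_token
    ]
    return [(bundle_id, uri) for _, bundle_id, uri in sorted(fresh)]


# Hypothetical bundle list; every URI and token value is made up.
EXAMPLE = """
[bundle]
    version = 1
    mode = all
    heuristic = creationToken

[bundle "daily"]
    uri = https://bundles.example.com/repo/daily.bundle
    creationToken = 3

[bundle "weekly"]
    uri = https://bundles.example.com/repo/weekly.bundle
    creationToken = 2

[bundle "base"]
    uri = https://bundles.example.com/repo/base.bundle
    creationToken = 1
"""

settings, bundles = parse_bundle_list(EXAMPLE)
print(settings["mode"])
print(new_bundles(bundles, last_seen_token=1))
```

A client that had already unbundled "base" (token 1) would, under these assumptions, fetch only "weekly" and "daily" on its next incremental fetch and then remember token 3.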