On 2022.07.25 13:53, Derrick Stolee via GitGitGadget wrote:
> This is the first of a series towards building the bundle URI feature as
> discussed in previous RFCs, specifically pulled directly out of [5]:
>
> [1]
> https://lore.kernel.org/git/RFC-cover-00.13-0000000000-20210805T150534Z-avarab@xxxxxxxxx/
> [2]
> https://lore.kernel.org/git/cover-0.3-00000000000-20211025T211159Z-avarab@xxxxxxxxx/
> [3]
> https://lore.kernel.org/git/pull.1160.git.1645641063.gitgitgadget@xxxxxxxxx
> [4]
> https://lore.kernel.org/git/RFC-cover-v2-00.36-00000000000-20220418T165545Z-avarab@xxxxxxxxx/
> [5]
> https://lore.kernel.org/git/pull.1234.git.1653072042.gitgitgadget@xxxxxxxxx
>
> THIS ONLY INCLUDES THE DESIGN DOCUMENT. See "Updates in v3". There are two
> patches:
>
>  1. The main design document that details the bundle URI standard and how
>     the client interacts with the bundle data.
>  2. An addendum to the design document that details one strategy for
>     organizing bundles from the perspective of a bundle provider.
>
> As outlined in [5], the next steps after this are:
>
>  1. Add 'git clone --bundle-uri=' to run a 'git bundle fetch ' step before
>     doing a fetch negotiation with the origin remote. [6]
>  2. Allow parsing a bundle list as a config file at the given URI. The
>     key-value format is unified with the protocol v2 verb (coming in (3)).
>     [7]
>  3. Implement the protocol v2 verb, re-using the bundle list logic from (2).
>     Use this to auto-discover bundle URIs during 'git clone' (behind a
>     config option). [8]
>  4. Implement the 'creationToken' heuristic, allowing incremental 'git
>     fetch' commands to download a bundle list from a configured URI, and
>     only download bundles that are new based on the creation token values.
>     [9]
>
> I have prepared some of this work as pull requests on my personal fork so
> curious readers can look ahead to where we are going:
>
> [6] https://github.com/derrickstolee/git/pull/18
> [7] https://github.com/derrickstolee/git/pull/20
> [8] https://github.com/derrickstolee/git/pull/21
> [9] https://github.com/derrickstolee/git/pull/22
>
> As mentioned in the design document, this is not all that is possible. For
> instance, Ævar's suggestion to download only the bundle headers can be used
> as a second heuristic (and as an augmentation of the timestamp heuristic).
>
>
> Updates in v3
> =============
>
>  * This version only includes the design document. Thanks to all the
>    reviewers for the significant attention that improves the doc a lot.
>  * The second patch has an addition to the design document that details a
>    potential way to organize bundles from the provider's perspective.
>  * Based on some off-list feedback, I was going to switch git fetch
>    --bundle-uri into git bundle fetch, but that has a major conflict with
>    [10] which was just submitted.
>  * I will move the git bundle fetch implementation into [6] which also has
>    the git clone --bundle-uri implementation.
>
> [10]
> https://lore.kernel.org/git/20220725123857.2773963-1-szeder.dev@xxxxxxxxx/
>
>
> Updates in v2
> =============
>
>  * The design document has been updated based on Junio's feedback.
>  * The "bundle.list." keys are now just "bundle.".
>  * The "timestamp" heuristic is now "creationToken".
>  * More clarity on how Git parses data from the bundle URI.
>  * Dropped some unnecessary bundle list keys (*.list, *.requires).
>
> Thanks, -Stolee
>
> Derrick Stolee (2):
>   docs: document bundle URI standard
>   bundle-uri: add example bundle organization
>
>  Documentation/Makefile                 |   1 +
>  Documentation/technical/bundle-uri.txt | 573 +++++++++++++++++++++++++
>  2 files changed, 574 insertions(+)
>  create mode 100644 Documentation/technical/bundle-uri.txt
>
>
> base-commit: e72d93e88cb20b06e88e6e7d81bd1dc4effe453f
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1248%2Fderrickstolee%2Fbundle-redo%2Ffetch-v3
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1248/derrickstolee/bundle-redo/fetch-v3
> Pull-Request: https://github.com/gitgitgadget/git/pull/1248
>
> Range-diff vs v2:
>
>  1:  d444042dc4d ! 1:  e0f003e1b5f docs: document bundle URI standard
>     @@ Commit message
>
>          Signed-off-by: Derrick Stolee <derrickstolee@xxxxxxxxxx>
>
>     + ## Documentation/Makefile ##
>     +@@ Documentation/Makefile: TECH_DOCS += SubmittingPatches
>     + TECH_DOCS += ToolsForGit
>     + TECH_DOCS += technical/bitmap-format
>     + TECH_DOCS += technical/bundle-format
>     ++TECH_DOCS += technical/bundle-uri
>     + TECH_DOCS += technical/cruft-packs
>     + TECH_DOCS += technical/hash-function-transition
>     + TECH_DOCS += technical/http-protocol
>     +
>       ## Documentation/technical/bundle-uri.txt (new) ##
>      @@
>      +Bundle URIs
>      +===========
>      +
>     ++Git bundles are files that store a pack-file along with some extra metadata,
>     ++including a set of refs and a (possibly empty) set of necessary commits. See
>     ++linkgit:git-bundle[1] and link:bundle-format.txt[the bundle format] for more
>     ++information.
>     ++
>      +Bundle URIs are locations where Git can download one or more bundles in
>      +order to bootstrap the object database in advance of fetching the remaining
>      +objects from a remote.
>     @@ Documentation/technical/bundle-uri.txt (new)
>      +	If this string-valued key exists, then the bundle list is designed to
>      +	work well with incremental `git fetch` commands. The heuristic signals
>      +	that there are additional keys available for each bundle that help
>     -+	determine which subset of bundles the client should download.
>     ++	determine which subset of bundles the client should download. The only
>     ++	heuristic currently planned is `creationToken`.
>      +
>      +The remaining keys include an `<id>` segment which is a server-designated
>     -+name for each available bundle.
>     ++name for each available bundle. The `<id>` must contain only alphanumeric
>     ++and `-` characters.
>      +
>      +bundle.<id>.uri::
>      +	(Required) This string value is the URI for downloading bundle `<id>`.
>     @@ Documentation/technical/bundle-uri.txt (new)
>      +
>      +Here is an example bundle list using the Git config format:
>      +
>     -+```
>     -+[bundle]
>     -+	version = 1
>     -+	mode = all
>     -+	heuristic = creationToken
>     ++	[bundle]
>     ++		version = 1
>     ++		mode = all
>     ++		heuristic = creationToken
>      +
>     -+[bundle "2022-02-09-1644442601-daily"]
>     -+	uri = https://bundles.example.com/git/git/2022-02-09-1644442601-daily.bundle
>     -+	timestamp = 1644442601
>     ++	[bundle "2022-02-09-1644442601-daily"]
>     ++		uri = https://bundles.example.com/git/git/2022-02-09-1644442601-daily.bundle
>     ++		creationToken = 1644442601
>      +
>     -+[bundle "2022-02-02-1643842562"]
>     -+	uri = https://bundles.example.com/git/git/2022-02-02-1643842562.bundle
>     -+	timestamp = 1643842562
>     ++	[bundle "2022-02-02-1643842562"]
>     ++		uri = https://bundles.example.com/git/git/2022-02-02-1643842562.bundle
>     ++		creationToken = 1643842562
>      +
>     -+[bundle "2022-02-09-1644442631-daily-blobless"]
>     -+	uri = 2022-02-09-1644442631-daily-blobless.bundle
>     -+	timestamp = 1644442631
>     -+	filter = blob:none
>     ++	[bundle "2022-02-09-1644442631-daily-blobless"]
>     ++		uri = 2022-02-09-1644442631-daily-blobless.bundle
>     ++		creationToken = 1644442631
>     ++		filter = blob:none
>      +
>     -+[bundle "2022-02-02-1643842568-blobless"]
>     -+	uri = /git/git/2022-02-02-1643842568-blobless.bundle
>     -+	timestamp = 1643842568
>     -+	filter = blob:none
>     -+```
>     ++	[bundle "2022-02-02-1643842568-blobless"]
>     ++		uri = /git/git/2022-02-02-1643842568-blobless.bundle
>     ++		creationToken = 1643842568
>     ++		filter = blob:none
>      +
>      +This example uses `bundle.mode=all` as well as the
>      +`bundle.<id>.creationToken` heuristic. It also uses the `bundle.<id>.filter`
>     @@ Documentation/technical/bundle-uri.txt (new)
>      +* The client fails to connect with a server at the given URI or a connection
>      +  is lost without any chance to recover.
>      +
>     -+* The client receives a response other than `200 OK` (such as `404 Not Found`,
>     -+  `401 Not Authorized`, or `500 Internal Server Error`). The client should
>     -+  use the `credential.helper` to attempt authentication after the first
>     -+  `401 Not Authorized` response, but a second such response is a failure.
>     ++* The client receives a 400-level response (such as `404 Not Found` or
>     ++  `401 Not Authorized`). The client should use the credential helper to
>     ++  find and provide a credential for the URI, but match the semantics of
>     ++  Git's other HTTP protocols in terms of handling specific 400-level
>     ++  errors.
>      +
>     -+* The client receives data that is not parsable as a bundle or bundle list.
>     ++* The server reports any other failure response.
>      +
>     -+* The bundle list describes a directed cycle in the
>     -+  `bundle.<id>.requires` links.
>     ++* The client receives data that is not parsable as a bundle or bundle list.
>      +
>      +* A bundle includes a filter that does not match expectations.
>      +
>      +* The client cannot unbundle the bundles because the prerequisite commit OIDs
>     -+  are not in the object database and there are no more
>     -+  `bundle.<id>.requires` links to follow.
>     ++  are not in the object database and there are no more bundles to download.
>      +
>      +There are also situations that could be seen as wasteful, but are not
>      +error conditions:
>     @@ Documentation/technical/bundle-uri.txt (new)
>      +  the client is using hourly prefetches with background maintenance, but
>      +  the server is computing bundles weekly. For this reason, the client
>      +  should not use bundle URIs for fetch unless the server has explicitly
>     -+  recommended it through the `bundle.flags = forFetch` value.
>     ++  recommended it through a `bundle.heuristic` value.
>      +
>      +Implementation Plan
>      +-------------------
>     @@ Documentation/technical/bundle-uri.txt (new)
>      +   that the config format parsing feeds a list of key-value pairs into the
>      +   bundle list logic.
>      +
>     -+3. Create the `bundle-uri` protocol v2 verb so Git servers can advertise
>     ++3. Create the `bundle-uri` protocol v2 command so Git servers can advertise
>      +   bundle URIs using the key-value pairs. Plug into the existing key-value
>      +   input to the bundle list logic. Allow `git clone` to discover these
>      +   bundle URIs and bootstrap the client repository from the bundle data.
>  2:  0a2cf60437f < -:  ----------- remote-curl: add 'get' capability
>  3:  abec47564fd < -:  ----------- bundle-uri: create basic file-copy logic
>  4:  f6255ec5188 < -:  ----------- fetch: add --bundle-uri option
>  5:  bfbd11b48bf < -:  ----------- bundle-uri: add support for http(s):// and file://
>  6:  a217e9a0640 < -:  ----------- fetch: add 'refs/bundle/' to log.excludeDecoration
>  -:  ----------- > 2:  a933471c3af bundle-uri: add example bundle organization
>
> --
> gitgitgadget

Looks good to me, thanks for the series!

Reviewed-by: Josh Steadmon <steadmon@xxxxxxxxxx>
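
For readers following the `creationToken` example in the quoted design document, here is one way the "only download bundles that are new" step could be sketched. This is a minimal Python illustration, not Git's implementation. It assumes that "new" means a `creationToken` strictly greater than the largest token the client has already processed, and that bundles are applied oldest first. The `Bundle` class, `select_new_bundles`, and the `last_seen_token` value are hypothetical names introduced only for this sketch; the bundle data is copied from the example list in the design document.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Bundle:
    """One bundle.<id>.* entry from a bundle list (hypothetical model)."""
    id: str
    uri: str
    creation_token: int
    filter: Optional[str] = None  # None means a full (unfiltered) bundle


def select_new_bundles(bundles: List[Bundle],
                       last_seen_token: int,
                       want_filter: Optional[str] = None) -> List[Bundle]:
    """Return bundles matching want_filter whose creationToken is
    strictly greater than last_seen_token, oldest first.

    Assumption for this sketch: the client remembers the largest
    creationToken it has already downloaded and skips anything at or
    below that value.
    """
    candidates = [
        b for b in bundles
        if b.filter == want_filter and b.creation_token > last_seen_token
    ]
    return sorted(candidates, key=lambda b: b.creation_token)


# The four entries from the example bundle list in the design document.
BUNDLES = [
    Bundle("2022-02-09-1644442601-daily",
           "https://bundles.example.com/git/git/2022-02-09-1644442601-daily.bundle",
           1644442601),
    Bundle("2022-02-02-1643842562",
           "https://bundles.example.com/git/git/2022-02-02-1643842562.bundle",
           1643842562),
    Bundle("2022-02-09-1644442631-daily-blobless",
           "2022-02-09-1644442631-daily-blobless.bundle",
           1644442631, "blob:none"),
    Bundle("2022-02-02-1643842568-blobless",
           "/git/git/2022-02-02-1643842568-blobless.bundle",
           1643842568, "blob:none"),
]

if __name__ == "__main__":
    # A full clone that already has the 2022-02-02 bundle only needs the daily one.
    for bundle in select_new_bundles(BUNDLES, last_seen_token=1643842562):
        print(bundle.id)
```

Keeping the filter comparison next to the token comparison mirrors the design document's point that a bundle whose filter does not match expectations should not be used.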