On 6/15/24 6:01 AM, Karthik Nayak wrote:
Jeff King <peff@xxxxxxxx> writes:
On Mon, Jun 10, 2024 at 02:25:19PM -0400, matthew sporleder wrote:
I have recently been playing with git clone --bundle-uri and loving it
because I can clone with almost-*zero* resources being used on the
server!
I am a little confused by https://git-scm.com/docs/bundle-uri
mentioning "discovery" and things. Is this something being added to
the git cli, a special feature for other clients, or is it still too
early-days to talk about much?
I would love to produce bundles of common use cases and have them
auto-discovered by git clone *without* the --bundle-uri parameter, and
then let our CDN do the heavy lifting of satisfying things like:
git clone
git clone --depth=1
git clone --single-branch --branch main
I'm not sure I hold out as much hope for pre-bundling pulls/updates
but any movement towards offloading our big-ish repos to CDNs is a win
for us.
I don't think the server side is well documented, but peeking at the
code, I think you want this on the server:
git config uploadpack.advertiseBundleURIs true
git config bundle.version 1
git config bundle.mode any
git config bundle.foo.uri https://example.com/your.bundle
And then the clients need to tell Git that they allow bundle transfers:
git config --global transfer.bundleURI true
I'm not sure if we'd eventually flip the client-side switch to "true" by
default (which is what you'd need for this to happen without any user
participation at all).
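For anyone wondering what those bundle.* keys amount to: they describe a "bundle list", which uses ordinary git-config syntax. Below is a rough Python sketch that renders such a list from a mapping of bundle IDs to URIs. The helper name render_bundle_list is made up for illustration and is not part of Git; the "foo" ID and example.com URL are the placeholders from the commands above.

```python
def render_bundle_list(bundles: dict[str, str], mode: str = "any", version: int = 1) -> str:
    """Render a bundle list in git-config file syntax (illustrative helper, not a Git API)."""
    # bundle.mode is documented as either "all" or "any".
    if mode not in ("any", "all"):
        raise ValueError(f"unsupported bundle.mode: {mode!r}")
    lines = ["[bundle]", f"\tversion = {version}", f"\tmode = {mode}"]
    for bundle_id, uri in bundles.items():
        # Each bundle.<id>.uri key becomes its own [bundle "<id>"] subsection.
        lines.append(f'[bundle "{bundle_id}"]')
        lines.append(f"\turi = {uri}")
    return "\n".join(lines) + "\n"


# Mirrors the three bundle.* settings shown above.
print(render_bundle_list({"foo": "https://example.com/your.bundle"}), end="")
```

The uploadpack.advertiseBundleURIs setting is separate from the list itself; it only controls whether the server advertises it.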
This would indeed be nice. We at GitLab have been experimenting with
bundle-uri. While it is easy to flip the switch for clients under our
control (CI pipelines), end users lose out on these benefits, especially
for large monorepos where the servers spend a lot of time computing the
packfile.
One gotcha there is that clients are now accessing an arbitrary URL
provided by the server, so there are cross-site security implications.
It might make more sense to allow only relative URLs without ".." (so if
I fetched from https://example.com/foo.git, the server could use only
the relative "bundles/bar.bundle", which would then be found at
https://example.com/foo.git/bundles/bar.bundle).
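To make the idea concrete, here is a minimal Python sketch of such a client-side check. This is only an illustration of the rule described above, not how Git behaves today; the function name resolve_relative_bundle_uri is hypothetical, and a real implementation would need to think harder about edge cases (encoded dots, backslashes, query strings).

```python
from urllib.parse import urlsplit


def resolve_relative_bundle_uri(remote_url: str, advertised: str) -> str:
    """Accept only relative bundle URIs without "..", resolved under the remote URL.

    Hypothetical helper sketching the proposed restriction.
    """
    parts = urlsplit(advertised)
    # Anything with a scheme or host ("https://...", "//host/...") is absolute.
    if parts.scheme or parts.netloc:
        raise ValueError("absolute bundle URIs are not allowed")
    if not advertised or advertised.startswith("/"):
        raise ValueError("bundle URI must be a non-empty relative path")
    # Reject any attempt to climb out of the repository path.
    if ".." in advertised.split("/"):
        raise ValueError('bundle URI must not contain ".."')
    # Append below the repository URL rather than replacing its last segment.
    return remote_url.rstrip("/") + "/" + advertised


print(resolve_relative_bundle_uri("https://example.com/foo.git", "bundles/bar.bundle"))
```

Note that plain string concatenation is used on purpose: urllib.parse.urljoin would replace the trailing "foo.git" segment instead of nesting the bundle path under it.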
-Peff
True. But I suspect servers using bundle URIs might not always serve them
from the same domain. I know we were experimenting with cloud storage
and providing the client with a one-time signed URL.
https://cloud.google.com/storage/docs/access-control/signed-urls
We (at Bitbucket) have implemented a bundle server that serves bundles
from cloud storage via expiring URLs. It would be nice to have bundle
server discovery built on the capability exchange in Git protocol v2.
Example pkt-line format:
007bbundle-server=https://cdn-1.bitbucket.org/workspace/repository/bundle,https://bitbucket.org/workspace/repository/bundle
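As a sanity check on the length prefix, here is a small Python sketch of pkt-line framing: four lowercase hex digits giving the total length (prefix included), followed by the payload. Keep in mind that "bundle-server" is only the capability name proposed in this message, not something Git advertises today, and the helper name pkt_line is made up for illustration.

```python
def pkt_line(payload: str) -> bytes:
    """Frame a payload as a single pkt-line (illustrative helper)."""
    data = payload.encode("utf-8")
    # The 4-byte hex prefix counts itself as well as the payload.
    length = len(data) + 4
    # pkt-lines are capped at 65520 bytes in total.
    if length > 65520:
        raise ValueError("payload too large for a single pkt-line")
    return f"{length:04x}".encode("ascii") + data


# Proposed (hypothetical) capability line from the example above.
capability = (
    "bundle-server="
    "https://cdn-1.bitbucket.org/workspace/repository/bundle,"
    "https://bitbucket.org/workspace/repository/bundle"
)
print(pkt_line(capability).decode("ascii"))
```

The payload is 119 bytes with no trailing newline, so the prefix comes out as 007b, matching the example.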
-Dhruva