On Wed, Feb 01, 2017 at 10:32:12AM +0100, Erik van Zijst wrote:

> Clients performing a full clone get redirected to a CDN where they seed
> their new local repo from a pre-built bundle file, and then pull/fetch
> any remaining changes. Mercurial has had native, built-in support for
> this for a while now.
>
> I imagine other large code hosts could benefit from this as well and
> I'd love to gauge the group's interest for this. Could this make sense
> for Git? Would it have a chance of landing?
>
> Our spike implements it as an optional capability during ref
> advertisement. What are your thoughts on this?

I think this is definitely an interesting topic to discuss tomorrow.
Here are a few observations from my past thinking on the issue. I
haven't read the proposal from earlier this week yet, so some of them
may be obsolete.

Seeding from a bundle CDN generally solves two problems: getting the
bulk of the data from someplace with higher bandwidth (the CDN), and
getting the bulk of the data over a protocol that can be resumed (the
bundle).

But we don't necessarily have to solve both problems simultaneously. And
you might not want to. Storing a separate bundle on another server is
complicated to configure, and doubles the amount of disk space you need
(just half of it is on the CDN). Using a bundle means you can't seed
from a non-bundle source.

So for any solution, I'd want to consider how you can put together the
pieces. Can you seed from a non-bundle? Can you seed from yourself and
just get resumability? If so, how hard is it to serve a pseudo-bundle
based on the packfiles you have on disk (i.e., getting resumability at
least in the common cases without paying the disk cost). I.e., saving
enough data that you could reconstruct the bundle byte-for-byte when you
need to.

If you _can_ do that latter part, and you take "I only care about
resumability" to the simplest extreme, you'd probably end up with a
protocol more like:

  Client: I need a packfile with this want/have
  Server: OK, here it is; its opaque id is XYZ.
  ... connection interrupted ...
  Client: It's me again. I have up to byte N of pack XYZ
  Server: OK, resuming
          [or: I don't have XYZ anymore; start from scratch]

Then generating XYZ and generating that bundle are basically the same
task.

All just food for thought. I look forward to digging into it more on the
list and in the in-person discussion.

-Peff
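
To make the resume-by-opaque-id exchange a bit more concrete, here is a
rough Python sketch. Everything in it is invented for illustration:
PackServer, request_pack, resume, expire, fetch, and resume_or_restart
are hypothetical names, the "pack" is just deterministic filler bytes,
and using a SHA-1 of the content as the opaque id is only one possible
choice. It is not how git or any proposed protocol extension works; it
only shows the two outcomes (resume, or start from scratch).

```python
import hashlib


class PackServer:
    """Toy server that remembers generated packs under an opaque id."""

    def __init__(self):
        self._packs = {}  # opaque id -> pack bytes

    def request_pack(self, wants, haves):
        # Stand-in for real pack generation: deterministic filler bytes
        # derived from the want/have sets (an assumption of this sketch).
        pack = self._generate(wants, haves)
        pack_id = hashlib.sha1(pack).hexdigest()
        self._packs[pack_id] = pack
        return pack_id, pack

    def resume(self, pack_id, offset):
        # Return the remaining bytes, or None if the pack is gone.
        pack = self._packs.get(pack_id)
        if pack is None:
            return None
        return pack[offset:]

    def expire(self, pack_id):
        # Simulate the server dropping its saved state for this pack.
        self._packs.pop(pack_id, None)

    @staticmethod
    def _generate(wants, haves):
        seed = "want " + " ".join(sorted(wants))
        seed += "\nhave " + " ".join(sorted(haves))
        return b"PACK" + seed.encode() * 64


def fetch(server, wants, haves, cut_at=None):
    """Request a pack; cut_at simulates the connection dropping early."""
    pack_id, pack = server.request_pack(wants, haves)
    received = pack if cut_at is None else pack[:cut_at]
    return pack_id, received


def resume_or_restart(server, pack_id, partial, wants, haves):
    """Try to resume from len(partial); fall back to a fresh fetch."""
    rest = server.resume(pack_id, len(partial))
    if rest is None:
        # Server no longer has the pack: start from scratch.
        return fetch(server, wants, haves)
    return pack_id, partial + rest
```

A client would call fetch(), notice the short read, and then call
resume_or_restart() with the id and the bytes it already has. The point
of the sketch is that whatever state lets the server answer resume() is
the same state it would need to regenerate a bundle byte-for-byte.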