On Mon, Jun 10, 2024 at 11:27 PM Jeff King <peff@xxxxxxxx> wrote:
>
> On Mon, Jun 10, 2024 at 12:04:30PM -0700, Emily Shaffer wrote:
>
> > > One strategy people have worked on is for servers to point clients at
> > > static packfiles (which _do_ remain byte-for-byte identical, and can be
> > > resumed) to get some of the objects. But it requires some scheme on the
> > > server side to decide when and how to create those packfiles. So while
> > > there is support inside Git itself for this idea (both on the server and
> > > client side), I don't know of any servers where it is in active use.
> >
> > We use packfile offloading heavily at Google (any repositories hosted
> > at *.googlesource.com, as well as our internal-facing hosting). It
> > works quite well for us scaling large projects like Android and
> > Chrome; we've been using it for some time now and are happy with it.
>
> Cool! I'm glad to hear it is in use.
>
> It might be helpful for other potential users if you can share how you
> decide when to create the off-loaded packfiles, what goes in them, and
> so on. IIRC the server-side config is mostly geared at stuffing a few
> large blobs into a pack (since each blob must have an individual config
> key). Maybe JGit (which I'm assuming is what powers googlesource) has
> better options there.

IIRC the upstream conf was oriented to offload individual blobs. In
JGit/Google we do the offloading at pack level.

We write to storage and CDN when creating a pack and keep the offloaded
location in the pack metadata. We do this only in certain conditions
(GC, above a certain size, ...). At serving time, if we see that we need
to send a pack "as-is" (all objects inside are needed) and it has an
offload, then we mark it to send the URL instead of the contents.

As the offload is just a copy of the pack, we can use the pack bitmap to
know what is there or not.

> > However, one thing that's missing is the resumable download Ellie is
> > describing.

Another thing missing in the offload story is support for offloads in
non-HTTP protocols, e.g. after cloning via my-protocol://, being able to
fetch my-protocol://blah/blah URLs.

Ivan
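
For illustration only, the serving-time decision described above (send
the URL when a pack is needed "as-is" and has an offload) could be
sketched roughly as follows. This is not JGit code: the Pack class, the
plan_response function, and the example URL are all hypothetical names
made up for the sketch, and a plain set of object IDs stands in for what
the pack bitmap would tell the server.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Pack:
    """Hypothetical stand-in for a server-side pack and its metadata."""
    name: str
    objects: FrozenSet[str]            # object IDs, as a pack bitmap would report
    offload_url: Optional[str] = None  # set only if a copy was written to storage/CDN


def plan_response(packs: Iterable[Pack], wanted: Set[str]) -> Tuple[List[str], Set[str]]:
    """Split a request into offload URLs and objects still to be sent inline.

    A pack is replaced by its URL only when it has an offload and every
    object inside it is needed, i.e. it could be sent "as-is".
    """
    urls: List[str] = []
    remaining = set(wanted)
    for pack in packs:
        sendable_as_is = bool(pack.objects) and pack.objects <= wanted
        if pack.offload_url is not None and sendable_as_is:
            urls.append(pack.offload_url)
            remaining -= pack.objects
    return urls, remaining


# Example: one large offloaded pack and one small pack without an offload.
gc_pack = Pack("gc", frozenset({"a", "b", "c"}), "https://cdn.example.invalid/gc.pack")
small_pack = Pack("recent", frozenset({"d", "e"}))

full_urls, full_rest = plan_response([gc_pack, small_pack], {"a", "b", "c", "d"})
partial_urls, partial_rest = plan_response([gc_pack, small_pack], {"a", "b", "d"})
```

In the first call the whole "gc" pack is needed, so its URL is returned
and only object "d" is left to send inline. In the second call object
"c" is not wanted, so the pack cannot go out as-is and nothing is
offloaded.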