On Wed, Jul 24, 2024 at 04:49:57PM +0200, Toon Claes wrote:
> One way to achieve this is possible when the "creationToken" heuristic
> is used for bundle URIs. We attempt to download and unbundle the minimum
> number of bundles by creationToken in decreasing order. If we fail to
> unbundle (after a successful download) then move to the next
> non-downloaded bundle and attempt downloading. Once we succeed in
> applying a bundle, move to the previous unapplied bundle and attempt to
> unbundle it again. At the end the highest applied creationToken is
> written to `fetch.bundleCreationToken` in the git-config. The next time
> bundles are advertised by the server, bundles with a lower creationToken
> value are ignored. This was already implemented by
> 7903efb717 (bundle-uri: download in creationToken order, 2023-01-31) in
> fetch_bundles_by_token().

I think Junio essentially asked this already, but I'm still missing the
bigger picture here. When the "creationToken" heuristic is applied, the
effect of your change is that we'll now always favor bundle URIs over
performing proper fetches, right?

Now suppose that the server creates a new bundle whenever somebody
pushes a new change to the default branch. We do not really have any
information about how this bundle is structured. It _could_ be an
incremental bundle, and in that case it might be sensible to fetch that
bundle. But it could also be that the server generates a full bundle
including all objects transitively reachable from that default branch.
If we started to rely on the "creationToken" heuristic in that case, we
would basically end up re-downloading the complete repository, which is
a strict regression.

Now that scenario is of course hypothetical. But the problem is that the
strategy for how bundle URIs are generated is determined by the hosting
provider. So ultimately, I expect that the reality will lie somewhere in
between and be different depending on which hosting solution you use.

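To make the concern concrete, here is a rough Python sketch of the walk
described in the quoted text. It is a simplified model and not Git's
actual code: the `Bundle` class, the `fetch_by_token()` helper, the
object names and the sizes are all made up for illustration, and
"unbundling" is reduced to a check that the bundle's prerequisites are
already present.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bundle:
    """Hypothetical model of an advertised bundle."""
    token: int            # creationToken
    size: int             # download cost, arbitrary units
    prereqs: frozenset    # objects the client must already have
    objects: frozenset    # objects the bundle provides


def fetch_by_token(advertised, have, stored_token):
    """Return (units downloaded, highest applied creationToken)."""
    # Bundles at or below the stored token are ignored.
    candidates = sorted(
        (b for b in advertised if b.token > stored_token),
        key=lambda b: b.token,
        reverse=True,
    )
    have = set(have)
    pending = []          # downloaded but not yet applied, newest first
    downloaded = 0
    max_applied = stored_token

    for bundle in candidates:
        downloaded += bundle.size
        pending.append(bundle)
        # Apply the most recently downloaded bundle if possible, then
        # retry the newer ones that failed earlier.
        while pending and pending[-1].prereqs <= have:
            applied = pending.pop()
            have |= applied.objects
            max_applied = max(max_applied, applied.token)
        if not pending:
            break         # everything downloaded so far is applied

    return downloaded, max_applied


# Client cloned at token 2 and has c1 and c2; somebody then pushes c3.
client_has = {"c1", "c2"}
incremental = Bundle(3, 5, frozenset({"c2"}), frozenset({"c3"}))
full = Bundle(3, 1000, frozenset(), frozenset({"c1", "c2", "c3"}))

print(fetch_by_token([incremental], client_has, 2))
print(fetch_by_token([full], client_has, 2))
```

In this model both servers advertise a bundle with creationToken 3, so
the client treats them identically, yet the full bundle costs as much
as the whole repository. The token alone does not tell the two cases
apart.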
All of this to me means that the "creationToken" heuristic is not really
a good signal, unless I'm missing something about the way it works. Is
there any additional signal provided by the server except for the time
when the bundle was created? If so, is that information sufficient to
determine whether it makes sense for a client to fetch a bundle instead
of performing a "proper" fetch? If not, what additional info would we
need to make this scheme work properly?

So unless I'm missing something, I feel like we need to think bigger and
design a heuristic that gives us the information needed. Without such a
heuristic, default-enabling may or may not do the right thing, and we
have no way to really argue whether it will, as we would then depend on
server operators to do the right thing.

Patrick