Re: [PATCH v2 3/3] fetch: use bundle URIs when having creationToken heuristic

On Wed, Jul 24, 2024 at 04:49:57PM +0200, Toon Claes wrote:
> One way to achieve this is possible when the "creationToken" heuristic
> is used for bundle URIs. We attempt to download and unbundle the minimum
> number of bundles by creationToken in decreasing order. If we fail to
> unbundle (after a successful download) then move to the next
> non-downloaded bundle and attempt downloading. Once we succeed in
> applying a bundle, move to the previous unapplied bundle and attempt to
> unbundle it again. At the end the highest applied creationToken is
> written to `fetch.bundleCreationToken` in the git-config. The next time
> bundles are advertised by the server, bundles with a lower creationToken
> value are ignored. This was already implemented by
> 7903efb717 (bundle-uri: download in creationToken order, 2023-01-31) in
> fetch_bundles_by_token().
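
For readers following along, the ordering described above can be modelled
roughly as below. This is only a sketch under stated assumptions, not the C
code in fetch_bundles_by_token(): the `Bundle` class, its `base` field (the
token of the bundle it builds on) and the rule that anything at or below the
stored token is already present are all hypothetical simplifications.

```python
from dataclasses import dataclass


@dataclass
class Bundle:
    token: int  # creationToken advertised by the server
    base: int   # hypothetical: token of the bundle this one depends on (0 = none)


def fetch_by_token(bundles, stored_token=0):
    """Return (highest applied token, tokens downloaded in order)."""
    # Bundles at or below the recorded token are ignored entirely.
    candidates = sorted(
        (b for b in bundles if b.token > stored_token),
        key=lambda b: b.token,
        reverse=True,
    )
    applied = set()
    downloaded = []
    pending = []  # downloaded but not yet unbundled; lowest token on top

    def can_apply(b):
        # Assumption: prerequisites are met if the base bundle was applied
        # now or is covered by the previously stored token.
        return b.base <= stored_token or b.base in applied

    for b in candidates:
        downloaded.append(b.token)
        if not can_apply(b):
            pending.append(b)  # unbundle failed, try the next lower bundle
            continue
        applied.add(b.token)
        # Walk back up through the bundles that failed earlier.
        while pending and can_apply(pending[-1]):
            applied.add(pending.pop().token)
        if not pending:
            break  # everything newer is applied, nothing more to download
    return max(applied, default=stored_token), downloaded


chain = [Bundle(1, 0), Bundle(2, 1), Bundle(3, 2)]
print(fetch_by_token(chain))                  # fresh client
print(fetch_by_token(chain, stored_token=2))  # client already at token 2
```

In this model a fresh client ends up downloading all three bundles, while a
client that already recorded token 2 downloads only the newest one.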

I think Junio essentially asked this already, but I'm still missing the
bigger picture here. When the "creationToken" heuristic is applied, the
effect of your change is that we'll always favor bundle URIs now over
performing proper fetches, right?

Now suppose that the server creates a new bundle whenever somebody pushes
a new change to the default branch. We do not really have information
how this bundle is structured. It _could_ be an incremental bundle, and
in that case it might be sensible to fetch that bundle. But it could
also be that the server generates a full bundle including all objects
transitively reachable from that default branch. Now if we started to
rely on the "creationToken" heuristic, we would basically end up
re-downloading the complete repository, which is a strict regression.
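
To put rough numbers on that, here is a back-of-the-envelope model. The sizes
are made up and the function names are hypothetical; it only assumes that a
full bundle contains every object while an incremental bundle (or a proper
fetch) transfers only what the client is missing.

```python
def bundle_cost(push_sizes, client_has, full_bundles):
    """Size downloaded when the client takes the newest bundle(s).

    push_sizes[i] is the size of the objects introduced by push i;
    client_has is the number of pushes the client already has.
    """
    if full_bundles:
        # The newest bundle carries the complete history again.
        return sum(push_sizes)
    # Incremental bundles: only the bundles newer than the client's state.
    return sum(push_sizes[client_has:])


def fetch_cost(push_sizes, client_has):
    """Size of a proper fetch, which negotiates only the missing objects."""
    return sum(push_sizes[client_has:])


# Hypothetical repository: a large initial import followed by small pushes.
pushes = [1000, 5, 5, 5]
print(bundle_cost(pushes, client_has=3, full_bundles=True))   # 1015
print(bundle_cost(pushes, client_has=3, full_bundles=False))  # 5
print(fetch_cost(pushes, client_has=3))                       # 5
```

With full bundles the client that is one push behind downloads the whole
repository again, whereas incremental bundles cost the same as the fetch.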

Now that scenario is of course hypothetical. But the problem is that the
strategy for how bundle URIs are generated is determined by the hosting
provider. So ultimately, I expect that the reality will lie somewhere in
between and be different depending on which hosting solution you use.

All of this to me means that the "creationToken" heuristic is not really
a good signal, unless I'm missing something about the way it works. Is
there any additional signal provided by the server except for the time
when the bundle was created? If so, is that information sufficient to
determine whether it makes sense for a client to fetch a bundle instead
of performing a "proper" fetch? If not, what is the additional info that
we would need to make this scheme work properly?

So unless I'm missing something, I feel like we need to think bigger and
design a heuristic that gives us the information needed. Without such a
heuristic, default-enabling may or may not do the right thing, and we
have no way to really argue whether it will, as we now depend on
server operators to do the right thing.

Patrick


