Re: [PATCH v2 4/4] bundle v3: the beginning

Christian Couder <christian.couder@xxxxxxxxx> · Tue, 7 Jun 2016 15:19:46 +0200

On Wed, Jun 1, 2016 at 12:31 AM, Jeff King <peff@xxxxxxxx> wrote:
> On Fri, May 20, 2016 at 02:39:06PM +0200, Christian Couder wrote:
>
>> I wonder if this mechanism could also be used or extended to clone and
>> fetch an alternate object database.
>>
>> In [1], [2] and [3], and this was also discussed during the
>> Contributor Summit last month, Peff says that he started working on
>> alternate object database support a long time ago, and that the hard
>> part is a protocol extension to tell remotes that you can access some
>> objects in a different way.
>>
>> If a Git client would download a "$name.bndl" v3 bundle file that
>> would have a "data: $URL/alt-odb-$name.odb" extended header, the Git
>> client would just need to download "$URL/alt-odb-$name.odb" and use
>> the alternate object database support on this file.
>>
>> This way it would know all it has to know to access the objects in the
>> alternate database. The alternate object database may not contain the
>> real objects, if they are too big for example, but just files that
>> describe how to get the real objects.
>
> I'm not sure about this strategy.

I am also not sure that this is the best strategy, but I think it's
worth discussing.

> I see two complications:
>
>   1. I don't think bundles need to be a part of this "external odb"
>      strategy at all. If I understand correctly, I think you want to use
>      it as a place to stuff metadata that the server tells the client,
>      like "by the way, go here if you want another way to access some
>      objects".

Yeah, basically I think it might be possible to use the bundle
mechanism to transfer what an external ODB on the client would need to
be initialized or updated.

>      But there are lots of cases where the server might want to tell
>      the client that don't involve bundles at all.

The idea is also that anytime the server needs to send external ODB
data to the client, it would ask its own external ODB to prepare a
kind of bundle with that data and use the bundle v3 mechanism to send
it.
That may need the bundle v3 mechanism to be extended, but I don't see
in which cases it would not work.

>   2. A server pointing the client to another object store is actually
>      the least interesting bit of the protocol.
>
>      The more interesting cases (to me) are:
>
>        a. The receiving side of a connection (e.g., a fetch client)
>           somehow has out-of-band access to some objects. How does it
>           tell the other side "do not bother sending me these objects; I
>           can get them in another way"?

I don't see a difference with regular objects that the fetch client
already has. If it already has some regular objects, a way to tell the
server "don't bother sending me these objects" is useful already and
it should be possible to use it to tell the server that there is no
need to send some objects stored in the external ODB too.

Also something like this is needed for shallow clones and narrow clones anyway.

>        b. The receiving side of a connection has out-of-band access to
>           some objects. Some of these will be expensive to get (e.g.,
>           requiring a large download), and some may be fast (e.g.,
>           they've already been fetched to a local cache). How do we tell
>           the sending side not to assume we have cheap access to these
>           objects (e.g., for use as a delta base)?

I don't think we need to tell the sending side we have cheap access or
not to some objects.
If the objects are managed by the external ODB, it's the external ODB
on the server and on the client that will manage these objects. They
should not be used as delta bases.
Perhaps there is no mechanism to say that some objects (basically all
external ODB managed objects) should not be used as delta bases, but
that could be added.

Thanks,
Christian.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html