Re: Resumable git clone?

On Wed, Mar 2, 2016 at 3:13 PM, Josh Triplett <josh@xxxxxxxxxxxxxxxx> wrote:
> On Wed, Mar 02, 2016 at 02:30:24AM +0000, Al Viro wrote:
>> On Tue, Mar 01, 2016 at 05:40:28PM -0800, Stefan Beller wrote:
>>
>> > So throwing away half finished stuff while keeping the front load?
>>
>> Throw away the object that got truncated and ones for which delta chain
>> doesn't resolve entirely in the transferred part.
>>
>> > > indexing the objects it
>> > > contains, and then re-running clone and not having to fetch those
>> > > objects.
>> >
>> > The pack is not deterministic for a given repository. When creating
>> > the pack, you may encounter races between threads, such that the order
>> > in a pack differs.
>>
>> FWIW, I wasn't proposing to recreate the remaining bits of that _pack_;
>> just do the normal pull with one addition: start with sending the list
>> of sha1 of objects you are about to send and let the recipient reply
>> with "I already have <set of sha1>, don't bother with those".  And exclude
>> those from the transfer.  Encoding for the set being available is an
>> interesting variable here - might be plain list of sha1, might be its
>> complement ("I want the following subset"), might be "145th to 1029th,
>> 1517th and 1890th to 1920th of the list you've sent"; which form ends
>> up more efficient needs to be found experimentally...
>
> As a simple proposal, the server could send the list of hashes (in
> approximately the same order it would send the pack), the client could
> send back a bitmap where '0' means "send it" and '1' means "got that one
> already", and the client could compress that bitmap.  That gives you the
> RLE and similar without having to write it yourself.  That might not be
> optimal, but it would likely set a high bar with minimal effort.

We have an implementation of EWAH bitmap compression, so compressing
is not a problem.
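
To make the quoted bitmap idea concrete, here is a minimal sketch. It assumes the server's hash list arrives as an ordered list of hex strings, and it uses zlib purely as a stand-in compressor; it is not EWAH and not what git does. The function names are hypothetical.

```python
import zlib


def have_bitmap(server_hashes, local_hashes):
    """Pack one bit per server hash: 1 = "got that one already", 0 = "send it".

    Bit i lives in byte i // 8 at position i % 8 (least significant bit first).
    """
    bits = bytearray((len(server_hashes) + 7) // 8)
    for i, h in enumerate(server_hashes):
        if h in local_hashes:
            bits[i // 8] |= 1 << (i % 8)
    return bytes(bits)


def compress_bitmap(bitmap):
    """Stand-in for EWAH: any general-purpose compressor shrinks long runs."""
    return zlib.compress(bitmap)


def decompress_bitmap(blob):
    """Inverse of compress_bitmap, as the server side would apply it."""
    return zlib.decompress(blob)
```

A client that received the front part of a pack would have long runs of 1s followed by long runs of 0s, which is the case where any run-length style compression does well.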

But I still don't see why it's more efficient to have the server send
the hash list to the client. Assume you need to transfer N objects.
In that direction the server always sends N hashes. But if the client
instead sends the hashes of the M objects it has already fetched, then
M <= N, and there is no need to send a bitmap back. What did I miss?
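
The arithmetic behind that question, as a rough sketch. It assumes 20-byte binary SHA-1 hashes and counts only hash and bitmap bytes, ignoring protocol framing and the pack data itself. The function names and the numbers in the usage note are made up for illustration.

```python
HASH_BYTES = 20  # size of a binary SHA-1 (assumption: hashes sent uncompressed)


def server_first_cost(n, bitmap_bytes):
    """Server sends all N hashes, client answers with a (compressed) bitmap."""
    return n * HASH_BYTES + bitmap_bytes


def client_first_cost(m):
    """Client sends only the M hashes it already has; no bitmap is needed."""
    return m * HASH_BYTES


def client_first_is_cheaper_or_equal(n, m, bitmap_bytes=0):
    """True whenever M <= N, since the bitmap size cannot be negative."""
    return client_first_cost(m) <= server_first_cost(n, bitmap_bytes)
```

For example, with N = 1000 and M = 300, the client-first direction costs 6000 bytes of hashes, while the server-first direction costs 20000 bytes of hashes plus whatever the bitmap compresses to.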
-- 
Duy