Re: Features from GitSurvey 2010


 



On Tue, Feb 1, 2011 at 09:11, Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> wrote:
> On Tue, Feb 1, 2011 at 11:27 PM, Shawn Pearce <spearce@xxxxxxxxxxx> wrote:
>> On Tue, Feb 1, 2011 at 05:51, Jakub Narebski <jnareb@xxxxxxxxx> wrote:
>>>
>>>> > resumable clone/fetch (and other remote operations)
>>>>
>>>> Jakub Narebski seems to be interested in this and Nicolas Pitre has
>>>> given some good advice about it.  You can get something usable today
>>>> by putting up a git bundle for download over HTTP or rsync, so it is
>>>> possible that this just involves some UI (porcelain) and documentation
>>>> work to become standard practice.
>>>
>>> I wouldn't say that: it is Nicolas Pitre (IIRC) who was doing the work;
>>> I was only an interested party posting comments, but no code.
>>>
>>> Again, this feature is not very easy to implement, and would require
>>> knowledge of git internals including "smart" git transport ("Pro Git"
>>> book can help there).
>>
>> I think Nico and I have mostly solved this with the pack caching idea.
>>  If we cache the pack file, we can resume anywhere in about 97% of the
>> transfer.  The first 3% cannot be resumed easily; it's back to the old
>> "git cannot be resumed" issue.  Fixing that last 3% is incredibly
>
> I thought the cached pack contained everything and for initial clone, we
> simply send the pack. What is this 3%? Commit list? Initial commit?

It's the recent changes.  If the cached pack starts from the tip of
master, it's probably 0%.  But if the repository owner has pushed new
changes since the cached pack was created, these are sent as a thin
pack in front of the cached pack... and make up that ~3% guess.  For
linux-2.6 I tested a 2 week period when the merge window was open right
after a release, and the new delta was about 3% of the overall
repository size.
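
To make the arithmetic concrete, here is a rough sketch in Python.  It
assumes the stream is just "thin pack bytes, then cached pack bytes";
the function names and the byte counts are made up for illustration and
are not anything from git itself.

```python
def resumable_fraction(thin_pack_bytes: int, cached_pack_bytes: int) -> float:
    """Share of the whole transfer that lies inside the cached pack."""
    total = thin_pack_bytes + cached_pack_bytes
    if total <= 0:
        raise ValueError("transfer must contain at least one byte")
    return cached_pack_bytes / total


def can_resume(offset: int, thin_pack_bytes: int) -> bool:
    """A broken transfer can be resumed only once it got past the thin pack.

    Assumption: the thin pack is generated on the fly and is not
    byte-for-byte reproducible, while the cached pack is a fixed file.
    """
    return offset >= thin_pack_bytes


if __name__ == "__main__":
    # Hypothetical sizes: 3 units of recent changes, 97 units of cached pack.
    print(resumable_fraction(3, 97))
    print(can_resume(offset=2, thin_pack_bytes=3))
    print(can_resume(offset=50, thin_pack_bytes=3))
```

If the cached pack starts at the tip of master, thin_pack_bytes is 0 and
every offset is resumable.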

> Narrow/Subtree clone is still just an idea, but can pack cache support
> be extended to make an initial narrow clone resumable too?

This would be very hard to do.  We could do cached packs for a popular
set of path specifications (e.g. Documentation/ if documentation only
editing is common), but once we start getting random requests for path
specifications that we cannot predict in advance and pre-pack we'd
have to fall back to the normal enumerate code path.
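
As a sketch of that fallback (Python, purely illustrative; the table of
pre-built packs, the pack file name and the function name are all
hypothetical, and nothing like this exists in git today):

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple


def choose_pack_source(
    requested_paths: Iterable[str],
    cached_packs: Dict[FrozenSet[str], str],
) -> Tuple[str, Optional[str]]:
    """Pick a pre-built pack for a known path spec, else fall back.

    Paths are normalized to end in "/" so that "Documentation" and
    "Documentation/" are treated as the same path specification.
    """
    key = frozenset(p.rstrip("/") + "/" for p in requested_paths)
    if key in cached_packs:
        # Popular, predictable request: serve the fixed (resumable) file.
        return ("cached", cached_packs[key])
    # Unpredicted request: enumerate objects as usual (not resumable).
    return ("enumerate", None)


if __name__ == "__main__":
    packs = {frozenset({"Documentation/"}): "pack-documentation.pack"}
    print(choose_pack_source(["Documentation"], packs))
    print(choose_pack_source(["drivers/net/"], packs))
```

Only the exact path sets we chose to pre-pack hit the cache; anything
else takes the normal enumerate path.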

-- 
Shawn.

