Re: git pack/unpack over bittorrent - works!

On Sat, 4 Sep 2010, Luke Kenneth Casson Leighton wrote:

> so, i believe that a much simpler algorithm is to follow nicolas' advice, and:
> 
> * split up a pack-index file by its fanout (1st byte of SHAs in the idx)
> * create SHA1s of the list of object-refs within an individual fanout
> * compare the per-fanout SHA1s remote and local
> * if same, deduce "oh look, we have that per-fanout list already"
> * grab the per-fanout object-ref list using standard p2p filesharing
> 
> in this way you'd end up breaking down e.g. 50mb of pack-index (for
> e.g. linux-2.6.git) into roughly 200k chunks, and you'd exchange
> roughly 50k of network traffic to find out that you'd got some of
> those fanout object-ref-lists already.  which is nice.
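
For reference, the scheme quoted above could be sketched roughly as
follows.  This is only an illustration under assumptions: object IDs are
taken as raw 20-byte values already extracted from the idx, the
per-fanout digest is a SHA1 over the sorted IDs (the proposal does not
say how the list is serialized), and the names fanout_digests and
fanouts_to_fetch are made up for the example.

```python
import hashlib
from collections import defaultdict

def fanout_digests(object_ids):
    """Map each fanout byte (0-255) to a SHA1 over that fanout's sorted IDs.

    object_ids: iterable of raw 20-byte object IDs (assumed input format).
    """
    buckets = defaultdict(list)
    for oid in object_ids:
        buckets[oid[0]].append(oid)  # first byte of the SHA1 is the fanout
    digests = {}
    for fanout, oids in buckets.items():
        h = hashlib.sha1()
        for oid in sorted(oids):
            h.update(oid)
        digests[fanout] = h.digest()
    return digests

def fanouts_to_fetch(local_ids, remote_digests):
    """Return the fanouts whose object-ref list differs from the remote's."""
    local = fanout_digests(local_ids)
    return sorted(f for f, d in remote_digests.items() if local.get(f) != d)

# Stand-in object IDs (SHA1 of a counter), not real git objects.
def fake_oid(n):
    return hashlib.sha1(str(n).encode()).digest()

remote_ids = [fake_oid(n) for n in range(1000)]
local_ids = remote_ids[:-1]  # local side lacks exactly one object
missing = fanouts_to_fetch(local_ids, fanout_digests(remote_ids))
```

With a single missing object only one fanout list differs, which is the
case the proposal is counting on.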

Scrap that idea -- this won't work.  The problem is that, by nature, 
SHA1 output is effectively random.  So if you have, say, 256 objects to 
transfer (and 256 objects is not that much) then, statistically, the 
SHA1s for those objects will most likely end up scattered across most of 
the 256 fanouts.  The algorithm I mentioned completely breaks down in 
that case: nearly every per-fanout list differs and has to be fetched.
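
To put a rough number on that, here is a small sketch.  It assumes the
first byte of each SHA1 is uniform over 0..255 and independent between
objects, and uses SHA1 of a counter as stand-in object IDs; the function
names are made up for the example.

```python
import hashlib

FANOUTS = 256

def expected_touched(n_objects, fanouts=FANOUTS):
    """Expected number of distinct fanouts hit by n uniformly random IDs."""
    # Each fanout is missed by all n objects with probability (1 - 1/F)^n.
    return fanouts * (1.0 - (1.0 - 1.0 / fanouts) ** n_objects)

def touched_fanouts(n_objects):
    """Count distinct first bytes among n stand-in object IDs."""
    first_bytes = {
        hashlib.sha1(str(i).encode()).digest()[0] for i in range(n_objects)
    }
    return len(first_bytes)

for n in (16, 256, 2048):
    print(n, round(expected_touched(n), 1), touched_fanouts(n))
```

Under that assumption, 256 new objects are expected to touch roughly 162
of the 256 fanouts, and a couple of thousand objects touch essentially
all of them, so the per-fanout comparison saves very little.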


Nicolas

