Re: Resumable clone/Gittorrent (again) - stable packs?

On Fri, Jan 7, 2011 at 08:09, Nicolas Pitre <nico@xxxxxxxxxxx> wrote:
> On Thu, 6 Jan 2011, Zenaan Harkness wrote:
>
>> Bittorrent requires some stability around torrent files.
>>
>> Can packs be generated deterministically?
>
> They _could_, but we do _not_ want to do that.
>
> The only thing which is stable in Git is the canonical representation of
> objects, and the objects they depend on, expressed by their SHA1
> signature.  Any BitTorrent-alike design for Git must be based on that
> property and not the packed representation of those objects which is not
> meant to be stable.
>
> If you don't want to design anything and simply reuse current BitTorrent
> codebase then simply create a Git bundle from some release version and
> seed that bundle for a sufficiently long period to be worth it.  Then
> falling back to git fetch in order to bring the repo up to date with the
> very latest commits should be small and quick.  When that clone gets too
> big then it's time to start seeding another more up-to-date bundle.
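
The bundle approach Nicolas describes could be scripted roughly as below. This is only a sketch: the function names, the `v1.0` release name, the file names and the example URL are placeholders I made up, and the functions merely build the git command lines rather than running them.

```python
from typing import List


def bundle_seed_commands(release: str, bundle: str) -> List[List[str]]:
    """Commands the seeder runs inside the upstream repository.

    `release` is a ref name (branch or tag) to bundle; `bundle` is the
    output file that would then be handed to a BitTorrent client.
    """
    return [
        ["git", "bundle", "create", bundle, release],
        ["git", "bundle", "verify", bundle],
    ]


def bundle_clone_commands(bundle: str, workdir: str, upstream_url: str) -> List[List[str]]:
    """Commands a downloader runs once the bundle file has arrived."""
    return [
        # Clone from the local bundle file instead of the network.
        ["git", "clone", bundle, workdir],
        # Point origin at the real upstream, then fetch whatever is newer.
        ["git", "-C", workdir, "remote", "set-url", "origin", upstream_url],
        ["git", "-C", workdir, "fetch", "origin"],
    ]


if __name__ == "__main__":
    for cmd in bundle_seed_commands("v1.0", "repo.bundle"):
        print(" ".join(cmd))
    for cmd in bundle_clone_commands("repo.bundle", "repo", "git://example.org/repo.git"):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)`. The bundle file itself never changes once written, which is what makes it seedable.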

Thanks guys for the explanations.

So, we don't _want_ to generate packs deterministically.
BUT, we _can_ reliably unpack a pack (duh).

So if my configured "canonical upstream" decides on a particular
compression etc., I (or rather my git client) don't care what my
upstream has chosen.
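
To make that concrete, here is a small sketch of why the client doesn't care. It assumes a SHA-1 repository and uses plain zlib streams as a stand-in for whatever encoding the upstream picked (a real pack also involves deltas, which this ignores). The name `git_blob_id` is mine, not git's.

```python
import hashlib
import zlib


def git_blob_id(content: bytes) -> str:
    """Object ID of a blob: SHA-1 over the header "blob <size>\\0" plus the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


content = b"hello world\n" * 1000

# Two upstreams choosing different compression settings for the same object.
fast = zlib.compress(content, 1)
small = zlib.compress(content, 9)

# The stored bytes are not the same...
stored_bytes_differ = fast != small
# ...but after unpacking, both yield the same canonical object.
same_object = git_blob_id(zlib.decompress(fast)) == git_blob_id(zlib.decompress(small))

print("stored bytes differ:", stored_bytes_differ)
print("object id identical:", same_object)
```

The object ID depends only on the uncompressed content, so any stable identifier has to be built on that rather than on the bytes on disk.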

What is important for torrent-able packs though is stability over some
time period, no matter what the format.

There's been much talk of caching, invalidating of caches, overlapping
torrent-packs etc.

In every case, for torrents to work, the P2P'd files must have some
stability over some time period.
(If this assumption is incorrect, please clarify, not counting
every-file-is-a-torrent and every-commit-is-a-torrent.)

So, torrentable options:
- torrent per commit
- torrent per pack
- torrent per torrent-archive - new file format

Torrent per commit - too small, too many torrents; we need larger
p2p-able sizes in general.

Torrent per pack - packs are created non-deterministically, both
between hosts and even on a single host (libz upgrade, nr_threads
change, git pack algorithm optimization).
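
As a rough illustration of why this breaks torrents: a torrent's metainfo records a SHA-1 for every fixed-size piece of the file, so two hosts must produce byte-identical files to seed the same torrent. The sketch below uses two zlib streams of the same data as a stand-in for two packs of the same objects; `piece_hashes` and the 16 KiB piece length are my own choices for the example, and hex digests are used only for readability.

```python
import hashlib
import zlib
from typing import List


def piece_hashes(data: bytes, piece_len: int = 16384) -> List[str]:
    """SHA-1 of each fixed-size piece of the file, in order."""
    return [
        hashlib.sha1(data[i:i + piece_len]).hexdigest()
        for i in range(0, len(data), piece_len)
    ]


# The same logical content on two hosts.
objects = b"".join(b"object %d\n" % i for i in range(50000))

# Host A and host B encode it with different settings.
pack_a = zlib.compress(objects, 1)
pack_b = zlib.compress(objects, 9)

# Both unpack to exactly the same content...
same_contents = zlib.decompress(pack_a) == zlib.decompress(pack_b)
# ...yet the piece hashes a torrent would carry do not line up.
pieces_match = piece_hashes(pack_a) == piece_hashes(pack_b)

print("same contents:", same_contents)
print("piece hashes match:", pieces_match)
```

So peers holding pack_a cannot serve pieces to someone downloading the torrent made from pack_b, even though both packs describe the same objects.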

A new torrent format, if "close enough" to current git pack
performance (cpu load, threadability, size), is a potential candidate
for a new version of the git pack file format - we don't want to store
two sets of pack files on disk if we can sensibly avoid it. This is
unlikely to happen, though - I can't conceive that a torrentable format
would be anything but worse than pack files, and it would therefore be
rejected from git master.

Can we relax the perceived requirement to deterministically create
pack files?
Well, over what time period are pack files stable in a particular git?
Over what time period do we require stable files for torrenting?

Can we simply configure our local git to keep specified pack files for
a specified time period?
And use those for torrent-packs?
Perhaps the torrent file could have a UseBy date?
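
Something like the following is what I have in mind for a UseBy date. It is purely hypothetical: neither git nor the torrent format has such a field as far as I know, and `TorrentPack`, `keep_for` and `use_by` are names I invented for the sketch. (git's `.keep` marker files already exempt a pack from repacking, but they carry no expiry.)

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TorrentPack:
    """Hypothetical record tying a pinned pack file to a seeding window."""
    pack_name: str        # file name of the pack we promise not to repack
    created: datetime     # when the pack was pinned
    keep_for: timedelta   # how long we promise to keep it unchanged

    @property
    def use_by(self) -> datetime:
        """The date that would be advertised alongside the torrent."""
        return self.created + self.keep_for

    def seedable(self, now: datetime) -> bool:
        """True while the pack is still inside its promised window."""
        return now < self.use_by


pinned = TorrentPack(
    pack_name="pack-example.pack",
    created=datetime(2011, 1, 7, tzinfo=timezone.utc),
    keep_for=timedelta(days=30),
)
print(pinned.use_by.isoformat())
print(pinned.seedable(datetime(2011, 1, 20, tzinfo=timezone.utc)))
```

After the UseBy date the local git would be free to repack, and clients would look for a newer torrent instead.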

Zen