Re: Huge win, compressing a window of delta runs as a unit

On 8/17/06, Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote:
Hi,

On Thu, 17 Aug 2006, Jon Smirl wrote:

> On 8/17/06, Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote:
> > At least, the delta-chains should be limited by size (_not_ by number of
> > deltas: you can have huge deltas, and if you have to unpack 5 huge deltas
> > before getting to the huge delta you really need, it takes really long).
>
> This is not an obvious conclusion.

The big win is bought by having _one_ zlib stream for multiple deltas,
right?

So, every time you want to access the _last_ delta in the chain, you unpack
_all_ of them. This quite obvious conclusion is obviously your reason to
propose two packs, one for the archive of "old" objects, and one for the
"new" objects.

Do some measurements: the IO vs. CPU time trade-off is heavily in favor
of eliminating the IO. It really doesn't take very much CPU to unpack
the delta chain.
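
To make the trade-off above concrete, here is a minimal sketch using
Python's zlib module. It is not git's pack format or the proposed
implementation; the delta payloads and the helper names (make_deltas,
compress_each, compress_window, read_from_window) are made up for
illustration. It compares compressing each delta separately with
compressing a window of deltas as one stream, and counts how many bytes
must be inflated to reach a given delta inside the shared stream.

```python
import zlib


def make_deltas(count=20):
    """Build hypothetical delta payloads: similar, repetitive byte strings."""
    deltas = []
    for i in range(count):
        body = "copy 0 4096\ninsert int counter = %d;\n" % i
        deltas.append(("delta %d\n" % i + body * 8).encode())
    return deltas


def compress_each(deltas):
    """One zlib stream per delta."""
    return [zlib.compress(d) for d in deltas]


def compress_window(deltas):
    """One zlib stream for the whole window of deltas."""
    return zlib.compress(b"".join(deltas))


def read_from_window(blob, sizes, index):
    """Return (delta at `index`, number of bytes inflated to reach it).

    Inflation stops as soon as the requested delta is complete, so
    deltas near the front of the stream are cheaper to reach.
    """
    end = sum(sizes[: index + 1])
    inflater = zlib.decompressobj()
    out = bytearray()
    pending = blob
    while len(out) < end and pending:
        out += inflater.decompress(pending, end - len(out))
        pending = inflater.unconsumed_tail
    start = end - sizes[index]
    return bytes(out[start:end]), len(out)


if __name__ == "__main__":
    deltas = make_deltas()
    sizes = [len(d) for d in deltas]
    window = compress_window(deltas)
    print("separate streams:", sum(len(c) for c in compress_each(deltas)))
    print("one stream:      ", len(window))
    for index in (0, len(deltas) - 1):
        _, inflated = read_from_window(window, sizes, index)
        print("delta", index, "needs", inflated, "bytes inflated")
```

Whether the extra inflation matters in practice depends on where the
wanted delta sits in the stream and on how expensive the IO is, which is
what the measurements would have to show.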

The two-pack proposal was not about reducing the delta chain length;
these are reverse deltas, so the newest version is always at the front.
Two packs are used to avoid repacking the 280MB pack every time you run
a repack command. It takes 2-3 hours to repack 280MB. Even if you just
copy the old pack to the new one, it takes 30 minutes to do all of the
IO.

> As for public servers there is an immediate win in shifting to the new
> pack format.  Three hour downloads vs one hour, plus the bandwidth
> bills. Are the tools smart enough to say this is a newer pack format,
> upgrade? It takes far less than two hours to upgrade your git install.

Have you thought about a non-upgraded client? The server has to repack in
that case, and if the client wants a clone, you might not even have the
time to kiss your server good-bye before it goes belly up.

Is there a pack format version number built into the server protocol?
If not, there needs to be one. When there is a mismatch with the server
pack version, the client should simply display an error requesting an
upgrade of the client software.

Git should be designed for forward evolution, not infinite backward
compatibility. It is easy to upgrade your client to support the new
protocol. The protocol just needs to ensure that the client reliably
gets an error about the need to upgrade.
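
As a rough sketch of the client-side check described above: a pack
stream begins with a 12-byte header (the signature "PACK", a 4-byte
version number, and a 4-byte object count, both in network byte order),
so a client can refuse a version it does not know. This only covers the
pack header, not any negotiation in the wire protocol. The set of
supported versions and the names PackVersionError and check_pack_header
are assumptions for illustration, not git's actual code.

```python
import struct


class PackVersionError(Exception):
    """Raised when the pack format version is newer than the client knows."""


# Assumption for illustration: the versions this hypothetical client reads.
SUPPORTED_VERSIONS = frozenset({2, 3})


def check_pack_header(header, supported=SUPPORTED_VERSIONS):
    """Parse the 12-byte pack header and return (version, object_count)."""
    if len(header) < 12:
        raise ValueError("a pack header is 12 bytes long")
    signature, version, count = struct.unpack(">4sII", header[:12])
    if signature != b"PACK":
        raise ValueError("not a pack stream")
    if version not in supported:
        raise PackVersionError(
            "pack format version %d is not supported; "
            "please upgrade your git client" % version
        )
    return version, count


if __name__ == "__main__":
    newer = struct.pack(">4sII", b"PACK", 9, 1)
    try:
        check_pack_header(newer)
    except PackVersionError as err:
        print("error:", err)
```

The point of the design is that an old client fails early with a clear
message instead of misreading data it cannot parse.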

Forward evolution implies that a client is able to work with older
servers, but not the inverse, that new servers have to work with old
clients.

There is an obvious choice here as to how fast people would upgrade their
servers.

Ciao,
Dscho




--
Jon Smirl
jonsmirl@xxxxxxxxx