Re: Re-Transmission of blobs?

On Tue, Sep 24, 2013 at 03:36:13AM -0400, Jeff King wrote:
> On Fri, Sep 20, 2013 at 11:27:15AM +0200, Josef Wolf wrote:

> > Even without asking, we can assume with great probability that
> > origin/somebranch is available at origin.
> Bear in mind that the transfer process does not know about
> cherry-picking at all.

It doesn't need to know.

> It only sees the other side's tips and traverses.

The sending side knows with high probability that origin/somebranch is
available at the receiving side (unless it was deleted there). And since the
file in question is part of the tree at the tip of origin/somebranch, we can
deduce that the file is available on the other side, too.
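
To illustrate the deduction, here is a toy sketch in Python. It is not how git
implements anything; object ids are plain strings, a tree is just a set of
blob ids, and the names objects_to_send, wanted and tip_trees are made up for
this example.

```python
# Toy model only: object ids are strings, a "tip tree" is a set of blob ids.
def objects_to_send(wanted, assumed_remote_tips, tip_trees):
    """Drop every wanted object that sits in the tip tree of a ref
    we assume the receiving side already has."""
    known_on_receiver = set()
    for ref in assumed_remote_tips:
        # Only the tree at the tip is consulted, no history traversal.
        known_on_receiver |= tip_trees.get(ref, set())
    return wanted - known_on_receiver

tip_trees = {"origin/somebranch": {"blob-big", "blob-readme"}}
wanted = {"blob-big", "blob-new"}
print(sorted(objects_to_send(wanted, ["origin/somebranch"], tip_trees)))
# prints ['blob-new']
```

The big blob is dropped from the list without looking at any ancestor of
origin/somebranch, which is all my simple case would need.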

> > And the file in question happens to reside in the tree at the very tip
> > of origin/somebranch, not in some of its ancestors. In this case,
> > there's no need to search the history at all. And even in this pretty
> > simple case, the algorithm seems to fail for some reason.
> 
> Correct. And in the current code, we should be looking at the tip tree
> for your case.  However, the usual reason to do so is to mark those
> objects as a "preferred base" in pack-objects for doing deltas. I wonder
> if we are not correctly noticing the case that an object is both
> requested to be sent and marked as a preferred base (in which case we
> should drop it from our sending list).

Further, it seems that marking the object as a preferred base had no effect,
since the delta should have been empty in this case. Or is this mechanism
deactivated for binary data (taken from /dev/zero in this case)?

> If that's the problem, it should be easy to fix cheaply. It would not
> work in the general case, but it would for your specific example. But
> since it costs nothing, there's no reason not to.
> 
> I'll see if I can investigate using the example script you posted.

Thanks!

> I meant "we do the optimization during history traversal that avoids
> going into sub-trees we have already seen". We do _not_ do the full
> history traversal for a partial push.

OK. I see. Maybe a config option to request a full traversal would be a
reasonable compromise? That way CPU could be traded against bandwidth for
repositories that happen to have slow/unreliable/expensive connections.

> Yes, that would be nice. However, in the common cases it would make
> things much worse. A clone of linux.git has ~3.5M objects.

Of course, if there's nothing you can drop, any attempt to drop objects will
only add overhead. That's similar to compressing files that are already
compressed: the result is larger than the input. Would that be a reasonable
argument to get rid of all attempts to compress files?
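
The compression analogy is easy to check. A minimal sketch using Python's
zlib module, with random bytes standing in for an incompressible file (the
helper name compress_twice is made up for this example):

```python
import os
import zlib

def compress_twice(data: bytes):
    """Return the sizes of the raw data, of one zlib pass and of two passes."""
    once = zlib.compress(data)
    twice = zlib.compress(once)
    return len(data), len(once), len(twice)

# Random bytes are incompressible, so each pass only adds framing overhead.
raw, once, twice = compress_twice(os.urandom(100_000))
print(raw, once, twice)
```

Each pass makes the data slightly larger, yet nobody concludes from that that
compression should be dropped altogether.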

-- 
Josef Wolf
jw@xxxxxxxxxxxxx