Re: Resumable git clone?

Josh Triplett <josh@xxxxxxxxxxxxxxxx> writes:

> I don't think it's worth the trouble and ambiguity to send abbreviated
> object names over the wire.  

Yup.  My unscientific experiment was meant to show that the list would be
far smaller than the actual transfer and between full binary and
full textual object name representations there would not be much
meaningful difference--you seem to have a better design sense to
grasp that point ;-)
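
To put rough numbers on the binary-versus-textual point, here is a
back-of-the-envelope sketch.  The function name and the one-byte
newline separator for the textual form are assumptions made for
illustration; only the 20-byte/40-hex sizes of a SHA-1 object name
come from git itself.

```python
RAW_NAME_BYTES = 20   # a SHA-1 object name in binary
HEX_NAME_BYTES = 40   # the same name spelled out in hex


def list_size_bytes(num_objects: int, binary: bool) -> int:
    """Approximate size of a list of object names.

    Assumes (hypothetically) that textual names are newline-separated
    and that binary names are simply concatenated.
    """
    if binary:
        return num_objects * RAW_NAME_BYTES
    return num_objects * (HEX_NAME_BYTES + 1)


if __name__ == "__main__":
    n = 1_000_000
    print("binary :", list_size_bytes(n, binary=True), "bytes")
    print("textual:", list_size_bytes(n, binary=False), "bytes")
```

Either way the list grows linearly with the object count, and the two
encodings differ by roughly a factor of two.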

> I think several simpler optimizations seem
> preferable, such as binary object names, and abbreviating complete
> object sets ("I have these commits/trees and everything they need
> recursively; I also have this stack of random objects.").

Given the way a pack stream is organized (i.e. commits first and then
trees and blobs that belong to the same delta chain together), and
our assumed goal being to salvage objects from an interrupted
transfer of a packfile, you are unlikely to ever see "I have these
commits/trees and everything they need" that are salvaged from such
a failed transfer.  So I doubt such an optimization is worth doing.

Besides, it is very expensive to compute (the computation is done on
the client side, so the cycles burned and the time the user has to
wait are of much less concern, though); you'd essentially be doing
"git fsck" to find the "dangling" objects.

The list of what would be transferred needs to come in full from the
server end, as the list names objects that the receiving end may not
have seen, but the response by the client could be encoded much more
tightly.  For the full list of N objects from the server, we can
think of your response as a bitstream of N bits, each on-bit in
which signals an unwanted object in the list.  You can optimize this
transfer by RLE compressing the bitstream, for example.
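
A minimal sketch of that response encoding, assuming the simplest
possible run-length scheme (alternating run lengths, starting with a
run of off-bits).  The function names and the run format are
hypothetical and are not part of any git protocol.

```python
from typing import List, Sequence, Set


def unwanted_bitmap(server_list: Sequence[str], have: Set[str]) -> List[int]:
    """One bit per object in the server's list.

    An on-bit (1) marks an object the client already has, i.e. an
    unwanted object that need not be sent again.
    """
    return [1 if name in have else 0 for name in server_list]


def rle_encode(bits: Sequence[int]) -> List[int]:
    """Encode bits as alternating run lengths.

    The first run always counts off-bits (0), so it is 0 when the
    bitstream starts with an on-bit.
    """
    runs: List[int] = []
    current = 0
    count = 0
    for bit in bits:
        if bit == current:
            count += 1
        else:
            runs.append(count)
            current = bit
            count = 1
    runs.append(count)
    return runs


def rle_decode(runs: Sequence[int]) -> List[int]:
    """Inverse of rle_encode."""
    bits: List[int] = []
    current = 0
    for length in runs:
        bits.extend([current] * length)
        current ^= 1
    return bits


if __name__ == "__main__":
    # Placeholder names; real entries would be full object names.
    server_list = ["obj-a", "obj-b", "obj-c", "obj-d", "obj-e", "obj-f"]
    have = {"obj-a", "obj-b", "obj-f"}
    bits = unwanted_bitmap(server_list, have)
    runs = rle_encode(bits)
    assert rle_decode(runs) == bits
    print(bits, runs)
```

The fewer and longer the runs of salvaged objects in the server's
list, the smaller the encoded response.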

As git-over-HTTP is stateless, however, you cannot assume that the
server side remembers what it sent to the client (instead, the
client side needs to re-post what it heard from the server in the
previous exchange to allow the server side to use it after
validating).  So "objects at these indices in your list" kind of
optimization may not work very well in that environment.  I'd
imagine that an exchange of "Here is the list of objects", "Give me
these objects" done naively in full 40-hex object names would work
OK there, though.
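
As a rough illustration of the naive stateless variant: the client
re-posts the list it was given together with the names it wants, and
the server checks both before answering.  Everything below is a
hypothetical sketch; the function name, the particular validation
steps and the in-memory set standing in for the server's object store
are assumptions, not an existing git interface.

```python
import re
from typing import List, Sequence, Set

HEX40 = re.compile(r"[0-9a-f]{40}")


def select_objects(reposted_list: Sequence[str],
                   wanted: Sequence[str],
                   server_objects: Set[str]) -> List[str]:
    """Validate a stateless request and return the names to send.

    The server remembers nothing from the previous exchange, so it
    re-checks the re-posted list instead of trusting it.
    """
    for name in list(reposted_list) + list(wanted):
        if not HEX40.fullmatch(name):
            raise ValueError("not a full 40-hex object name: %r" % name)

    listed = set(reposted_list)
    if not listed <= server_objects:
        raise ValueError("re-posted list names objects the server lacks")

    wanted_set = set(wanted)
    if not wanted_set <= listed:
        raise ValueError("client asked for objects outside the list")

    # Answer in the order of the re-posted list.
    return [name for name in reposted_list if name in wanted_set]


if __name__ == "__main__":
    a, b, c = "a" * 40, "b" * 40, "c" * 40  # placeholder object names
    print(select_objects([a, b, c], [c, a], {a, b, c}))
```

Because every request carries full object names, no index into an
earlier response has to survive between requests.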