On Wed, Mar 02, 2016 at 02:37:53PM +0700, Duy Nguyen wrote:
> On Wed, Mar 2, 2016 at 1:31 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> > Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:
> >
> >> FWIW, I wasn't proposing to recreate the remaining bits of that _pack_;
> >> just do the normal pull with one addition: start with sending the list
> >> of sha1 of objects you are about to send and let the recepient reply
> >> with "I already have <set of sha1>, don't bother with those". And exclude
> >> those from the transfer.
> >
> > I did a quick-and-dirty unscientific experiment.
> >
> > I had a clone of Linus's repository that was about a week old, whose
> > tip was at 4de8ebef (Merge tag 'trace-fixes-v4.5-rc5' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace,
> > 2016-02-22). To bring it up to date (i.e. a pull about a week's
> > worth of progress) to f691b77b (Merge branch 'for-linus' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs, 2016-03-01):
> >
> >     $ git rev-list --objects 4de8ebef..f691b77b1fc | wc -l
> >     1396
> >     $ git rev-parse 4de8ebef..f691b77b1fc |
> >       git pack-objects --revs --delta-base-offset --stdout |
> >       wc -c
> >     2444127
> >
> > So in order to salvage some transfer out of 2.4MB, the hypothetical
> > Al protocol would first have the upload-pack give 20*1396 = 28kB
>
> It could be 10*1396 or less. If the server calculates the shortest
> unambiguous SHA-1 length (quite cheap on fully packed repo) and sends
> it to the client, the client can just sends short SHA-1 instead. It's
> racy though because objects are being added to the server and abbrev
> length may go up. But we can check ambiguity for all SHA-1 sent by
> client and ask for resend for ambiguous ones.
>
> On my linux-2.6.git, 10 letters (so 5 bytes) are needed for
> unambiguous short SHA-1. But we can even go optimistic and ask the
> client for shorter SHA-1 with hope that resend won't be many.

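To make the numbers above concrete, here is a rough Python sketch of the
two calculations being discussed: the shortest hex prefix length that
keeps a set of object names unambiguous, and the size of the resulting
list when names are packed as binary. The function names are
hypothetical and this is not how git computes abbreviations internally;
it only assumes the names are distinct, equal-length hex strings.

```python
def shortest_unambiguous_len(hex_ids):
    """Smallest hex prefix length that still distinguishes every id.

    Assumes distinct, equal-length hex object names. After sorting, the
    longest prefix shared by any two ids is shared by some adjacent pair.
    """
    ids = sorted(set(hex_ids))
    longest_shared = 0
    for a, b in zip(ids, ids[1:]):
        n = 0
        while n < len(a) and n < len(b) and a[n] == b[n]:
            n += 1
        longest_shared = max(longest_shared, n)
    return longest_shared + 1


def wire_bytes(count, hex_len):
    """Bytes needed to send `count` names of `hex_len` hex digits in binary."""
    return count * ((hex_len + 1) // 2)


# Full 40-hex-digit SHA-1 names versus 10-hex-digit abbreviations,
# for the 1396 objects in the experiment quoted above.
full_size = wire_bytes(1396, 40)
short_size = wire_bytes(1396, 10)
```

With binary names, the full list is 20 bytes per object (the 28kB figure
above), and a 10-hex-digit abbreviation would be 5 bytes per object.
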
I don't think abbreviated object names are worth the trouble and
ambiguity on the wire. Several simpler optimizations seem preferable,
such as sending binary object names, and abbreviating complete object
sets ("I have these commits/trees and everything they need recursively;
I also have this stack of random objects."). That would work especially
well for resumable pull, or for optimizing pull during the merge window.

- Josh Triplett