On Thu, Nov 28, 2013 at 5:52 AM, Philip Oakley <philipoakley@xxxxxxx> wrote:
> In the pack transfer protocol (Documentation\technical\pack-protocol.txt)
> the negotiation for refs is discussed, but it's unclear to me if the
> negotiation explicitly navigates down into the trees and blobs of each
> commit that need to go into the pack.
>
> From one perspective I can see that, in the main, it's only commit objects
> that are being negotiated, and the DAG is used to imply which commit objects
> are to be sent between the wants and haves end points, without need to
> descend into their trees and blobs. The tags and the objects they point to
> are explicitly given so are negotiated easily.
>
> The other view is that the negotiation should be listing every object of any
> type between the wants and haves as part of the negotiation. I just couldn't
> tell from the docs which assumption is appropriate. Is there any extra
> clarifications on this?

Negotiation for the other object types is inferred from commits, because
sending a full listing is too much. If you say you have commit A, you
imply you have everything from commit A down to the bottom. With this
knowledge, when you want commit B, the sender only needs to send the
trees and blobs that do not exist in commit A or any of its
predecessors. To cut cost at the sender, though, we do something less
than optimal (check out the "edge" concept in the documentation, or
else in pack-objects.c). Pack bitmaps are supposed to provide cheap
object traversal and make the transferred pack even smaller.

> I ask as I was cogitating on options for a 'narrow' clone (to complement
> shallow clones ;-) that could, say, in some way limit the size of blobs
> downloaded, or the number of tree levels downloaded, or even path limiting.

Size limiting is easy because you don't need to traverse the object DAG
at all. Internally, pack-objects calls rev-list to collect the objects
to be sent. You just filter by size at that phase.
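To make the two points above concrete, here is a toy sketch in Python.
It is not how pack-objects.c is written and it ignores the "edge"
shortcut entirely. The object store, the object ids and the names
`reachable`, `objects_to_send` and `max_blob_size` are all made up for
illustration.

```python
# Toy object store: id -> (type, size in bytes, ids of referenced objects).
# Every id and size here is invented for the example.
OBJECTS = {
    "commit_A": ("commit", 200, ["tree_A"]),
    "tree_A": ("tree", 100, ["blob_readme", "blob_small"]),
    "commit_B": ("commit", 200, ["commit_A", "tree_B"]),
    "tree_B": ("tree", 100, ["blob_readme", "blob_small_v2", "blob_big"]),
    "blob_readme": ("blob", 50, []),
    "blob_small": ("blob", 300, []),
    "blob_small_v2": ("blob", 320, []),
    "blob_big": ("blob", 5_000_000, []),
}


def reachable(tips):
    """Return every object id reachable from the given tips."""
    seen = set()
    stack = list(tips)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(OBJECTS[oid][2])
    return seen


def objects_to_send(wants, haves, max_blob_size=None):
    """Split the objects the receiver lacks into (sent, excluded).

    Only commits are named in wants/haves; having a commit is taken to
    mean having everything reachable from it. Blobs larger than
    max_blob_size (if given) are held back and reported as excluded.
    """
    missing = reachable(wants) - reachable(haves)
    sent, excluded = set(), set()
    for oid in missing:
        kind, size, _ = OBJECTS[oid]
        if max_blob_size is not None and kind == "blob" and size > max_blob_size:
            excluded.add(oid)
        else:
            sent.add(oid)
    return sent, excluded


if __name__ == "__main__":
    sent, excluded = objects_to_send(["commit_B"], ["commit_A"],
                                     max_blob_size=1_000_000)
    print("sent:", sorted(sent))
    print("excluded:", sorted(excluded))
```

The receiver only ever names commits; blob_readme is skipped because it
is reachable from commit_A, and the size filter is applied while the
object list is being collected rather than in a separate pass.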
Support for raising or lowering the size limit is also workable, just
like how shallow deepen/shorten is done: you let the sender know you
have size limit A and now want to raise it to B, and the sender just
collects the extra objects in the A..B range for all "have" refs.

The problem is how to let the client know which objects were not sent
due to the size limit, so that it can set up refs/replace to stop the
user from running into missing objects. If there are too many excluded
objects, sending all those SHA-1s with pkt-line is inefficient. (Path
limiting does not have this problem; the client can infer the excluded
objects from the command line arguments most of the time.) Maybe you
could send this listing in a binary format just before sending the
pack.

BTW another way to deal with large blobs in a clone is git-annex. I was
wondering the other day whether we could integrate it into git to
provide a smooth UI (the user would not have to type "git annex
something", or at least not often). Of course git-annex would still be
optional, and the UI integration would only be activated via a config
key, after git-annex is installed.
--
Duy