Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> writes:

> From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
>
> First draft of design document for partial clone feature.
>
> Signed-off-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx>
> Signed-off-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx>
> ---

Thanks.

> +Non-Goals
> +---------
> +
> +Partial clone is independent of and not intended to conflict with
> +shallow-clone, refspec, or limited-ref mechanisms since these all operate
> +at the DAG level whereas partial clone and fetch works *within* the set
> +of commits already chosen for download.

It probably is not a huge deal (simply because it is about
"Non-Goals") but I have no idea what "refspec" and "limited-ref
mechanism" refer to in the above sentence, and I suspect many others
share the same puzzlement.

> +An object may be missing due to a partial clone or fetch, or missing due
> +to repository corruption.  To differentiate these cases, the local
> +repository specially indicates packfiles obtained from the promisor
> +remote.  These "promisor packfiles" consist of a "<name>.promisor" file
> +with arbitrary contents (like the "<name>.keep" files), in addition to
> +their "<name>.pack" and "<name>.idx" files.  (In the future, this ability
> +may be extended to loose objects[a].)
> + ...
> +Foot Notes
> +----------
> +
> +[a] Remembering that loose objects are promisor objects is mainly
> +    important for trees, since they may refer to promisor blobs that
> +    the user does not have.  We do not need to mark loose blobs as
> +    promisor because they do not refer to other objects.

I fail to see any logical link between the "loose" and "tree".
Putting it differently, I do not see why "tree" is so special.  A
promisor pack that contains a tree but lacks blobs the tree refers to
would be sufficient to let us remember that these missing blobs are
not corruption.
A loose commit or a tag that is somehow marked as obtained from a
promisor, if it can serve just like a commit or a tag in a promisor
pack to promise its direct pointee, would equally be useful (if very
inefficient).  In any case, I suspect "since they may refer to
promisor blobs" is a typo of "since they may refer to promised blobs".

> +- Currently, dynamic object fetching invokes fetch-pack for each item
> +  because most algorithms stumble upon a missing object and need to have
> +  it resolved before continuing their work.  This may incur significant
> +  overhead -- and multiple authentication requests -- if many objects are
> +  needed.
> +
> +  We need to investigate use of a long-running process, such as proposed
> +  in [5,6] to reduce process startup and overhead costs.

Also perhaps in some operations we can enumerate the objects we will
need upfront and ask for them in one go (e.g. "git log -p A..B" may
internally want to do "rev-list --objects A..B" to enumerate trees
and blobs that we may lack upfront).  I do not think having the
other side guess is a good idea, though.

> +- We currently only promisor packfiles.  We need to add support for
> +  promisor loose objects as described earlier.

The earlier description did not convince me of the need, at least
not yet.
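
To make the "enumerate upfront and ask in one go" idea above a bit
more concrete, here is a rough Python sketch.  It is only an
illustration, not how Git implements or would implement this.  The
function names are hypothetical, and the "have" set is an assumed
stand-in for whatever local object-existence check is available.  The
sketch only computes the list of wanted object names; it does not
fetch anything.

```python
import subprocess


def parse_rev_list_objects(output):
    """Parse `git rev-list --objects` output into (oid, path) pairs.

    Commits appear as a bare object name; trees and blobs appear as
    "<oid> <path>" (the path may be empty for a root tree).
    """
    entries = []
    for line in output.splitlines():
        if not line.strip():
            continue
        # Split at the first space only, so paths with spaces survive.
        oid, _, path = line.partition(" ")
        entries.append((oid, path))
    return entries


def missing_objects(entries, have):
    """Return object names not in `have`, deduplicated, in first-seen order.

    `have` is a hypothetical set of locally available object names.
    """
    seen = set()
    wanted = []
    for oid, _path in entries:
        if oid not in have and oid not in seen:
            seen.add(oid)
            wanted.append(oid)
    return wanted


def enumerate_range(rev_range, repo="."):
    """Run `git rev-list --objects <range>` and parse its output.

    Assumes the enumeration itself can complete, i.e. the commits and
    trees needed to walk the range are present locally.
    """
    out = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", rev_range],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_rev_list_objects(out)
```

A caller would pass the result of enumerate_range("A..B") through
missing_objects() and hand the whole list to a single fetch request,
instead of invoking fetch-pack once per missing object.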