On Sat, Nov 27, 2010 at 2:52 AM, Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:
> Cory Fields wrote:
>> On Fri, Nov 26, 2010 at 6:18 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
>>> True, but I suspect the above picture pretty much satisfies Cory's initial
>>> wish, no?  You can fetch recent 4'--5---6 history as if 4' were the root
>>> commit, and if you fetched replacement that tells us to pretend that 4'
>>> has 3 as its parent (and the history leading to 3), you will get a deeper
>>> history.
>>
>> Yes, both of these can be accomplished. I've managed to get that part
>> working, where a default clone pulls in half history, and fetching
>> refs/replace gives you the rest. The only problem is that it requires a
>> filter-branch before pushing.
>
> That's a one-time thing, not per-push, right?  A filter-branch would
> indeed be needed to transform the history
>
>  1 --- 2 --- 3 --- 4 --- 5' --- 6'
>
> into
>
>  1 --- 2 --- 3 --- 4
>  4' --- 5 --- 6
>
> and that is unavoidable: the object names encode the entire list of
> ancestors, you cannot push an object without its ancestors, etc.
> But afterwards you can build on the history rooted at 4' and all
> should be well, and you can use checkout --orphan to get a new
> root when the current line of history is about to grow too long.
>
> In other words, the distinction between real history and fake history
> is very relevant.  Object transport only cares about the real history
> (barring bugs); if you want to tweak what objects get transferred, you
> really need to rewrite the real history (or use --depth).
>
>> A shallow clone does not fit for us, because we want the default clone to
>> only pull half.  Having a public 1gb repository that will be cloned quite
>> often is bound to make our host unhappy, so we're doing everything we can to
>> get the size down.
>
> Why not publish a "git bundle" of the first 1gb using HTTP,
> BitTorrent, or some other cache-friendly protocol and use a hook to
> reject attempts to fetch too many objects at once from the host?
>
>> Also, maybe I haven't made this clear... the "real" commit IDs need to
>> match the "fake" ones in order to prevent confusion.
>
> Not sure what this means.  But commit IDs are defined based on
> content, and for simplicity and sanity the object transport machinery
> deliberately does not look beyond that.
>
> Regards,
> Jonathan
>

I think a one-time filter-branch is going to be our best bet. I had
assumed that this was the case; I just wanted reassurance that it was
necessary. I have that now. Thanks to all for the responses.

Martin: That sounds very interesting indeed. However, the docs make
shallow clones sound scary. From the docs: "A shallow repository has a
number of limitations (you cannot clone or fetch from it, nor push
from nor into it)"

I suppose these limitations would need to be addressed if/when looking
into server-side depth defaults?

Cory
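
For reference, here is a minimal sketch of why the one-time rewrite is
unavoidable, i.e. why "the object names encode the entire list of
ancestors". It assumes git's SHA-1 object naming (the hash of a
"<type> <size>\0" header followed by the object body). The author
identity, timestamp, and one-character commit messages are made up for
illustration, and every commit points at the empty tree to keep things
short. It only illustrates the naming scheme; it is not how
filter-branch itself works.

```python
import hashlib
from typing import List


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 name git gives an object with this type and body."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree: str, parents: List[str], message: str) -> str:
    """Hash a minimal commit object; parent IDs are part of the hashed body."""
    # Hypothetical fixed identity and timestamp so the result is deterministic.
    who = "A U Thor <author@example.com> 1290000000 +0000"
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}", "", message]
    return git_object_id("commit", ("\n".join(lines) + "\n").encode())


EMPTY_TREE = git_object_id("tree", b"")

c3 = commit_id(EMPTY_TREE, [], "3")          # stand-in for commit 3
c4 = commit_id(EMPTY_TREE, [c3], "4")        # 4, with 3 as its parent
c4_root = commit_id(EMPTY_TREE, [], "4")     # 4', same content but no parent
c5_on_4 = commit_id(EMPTY_TREE, [c4], "5")   # 5 built on 4
c5_on_4_root = commit_id(EMPTY_TREE, [c4_root], "5")  # 5 built on 4'

print("4  :", c4)
print("4' :", c4_root)
print("5 on 4  :", c5_on_4)
print("5 on 4' :", c5_on_4_root)
```

Dropping the parent from 4 changes its name, and that change propagates
to every descendant, which is why the commits after the cut point have
to be rewritten once before they can be pushed as a separate, shorter
line of history.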