Junio C Hamano <junkio@xxxxxxx> wrote:
> Now if we fix dumb transport downloaders, then we could even
> make a convention that the packs named pack-[0-9a-f]{40}.pack
> are archive packs.  And git-repack can even have a convention
> that .git/objects/pack/pack-active.(pack|idx) is the active
> pack.

Seems reasonable.  I take it you are proposing that a dumb transport
always downloads pack-active.pack as pack-n{40}.pack, where the dumb
protocol downloader computes the correct pack name from its contents.
Thus any remote pack downloaded over a dumb transport is automatically
treated as a historical pack by the receiving repository.

This means someone tracking a remote repository over a dumb transport
will frequently need to repack a subset of their historical packs into
their own active.pack while leaving the other historical packs
untouched.

But the more I think about this, the less either solution (an active
pack symref or pack-active.pack) really solves it.  Being limited to
just one active pack seems to be a problem, at least with the dumb
transports.

I think that's why I preferred the size threshold idea.  The active
packs are cheap to repack because they are small.  The larger packs
aren't cheap to repack because they are large, and probably
historical.  What we are trying to get is fast repacks for the active
objects while still getting full validation anytime we do a repack and
(possibly) destroy the source.  A size threshold does it.

When Jon Smirl and I started kicking around the idea of a historical
pack for Mozilla I was thinking of just storing a list of pack base
names in ".git/objects/pack/historical".  Packs listed there should
generally be exempt from repacking.  During an initial clone we'd need
to deliver the contents of that file to the new repository, since if
the source considers a pack historical it's likely the new repository
would want to as well.
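
To make the size threshold and the "historical" list a bit more
concrete, here is a rough Python sketch of how a repack helper might
pick its candidates.  It is illustrative only and is not how git-repack
behaves: the 64 MiB cutoff, the function names, and the one-base-name-
per-line layout of the "historical" file are all assumptions made for
the example.

```python
from pathlib import Path

# Hypothetical cutoff; the thread does not propose a specific number.
SIZE_THRESHOLD = 64 * 1024 * 1024


def read_historical(pack_dir):
    """Return pack base names listed in <pack_dir>/historical.

    Assumes one base name per line (e.g. "pack-<40 hex digits>");
    that layout is a guess for illustration, not a defined format.
    """
    listing = Path(pack_dir) / "historical"
    if not listing.exists():
        return set()
    lines = listing.read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def split_packs(pack_dir, threshold=SIZE_THRESHOLD):
    """Split packs into (active, kept) lists of file names.

    A pack is kept (exempt from repacking) if it is listed as
    historical or if it is at least `threshold` bytes; everything
    else is small and cheap, so it is a candidate for repacking.
    """
    pack_dir = Path(pack_dir)
    historical = read_historical(pack_dir)
    active, kept = [], []
    for pack in sorted(pack_dir.glob("*.pack")):
        if pack.stem in historical or pack.stat().st_size >= threshold:
            kept.append(pack.name)
        else:
            active.append(pack.name)
    return active, kept
```

With something like this, split_packs(".git/objects/pack") would hand
back the small packs to fold into a new pack and the large or listed
ones to leave alone, without needing a single designated active pack.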

But now as I write this email I'm thinking that it may be just as easy
to change the base name of the pack to "hist-n{40}" when we want to
consider it historical.

[snipped and re-ordered]
> It first downloads the .idx files, so it can compute the
> _right_ packname using the sorted object names recorded there

Why trust the .idx?  I've seen you post that the .idx is purely a
local matter.  The "smart" Git protocol only receives the .pack from
the remote and computes the .idx locally or unpacks it to loose
objects locally; why should a dumb transport trust the remote .idx?

Oh, I know, when the .idx is >50 MiB, the .pack is >450 MiB, has
2 million objects and delta chains ~5000 long.

Are we thinking that .idx files may need to have a slightly wider
distribution than "local"?

--
Shawn.