On Fri, 22 Sep 2017 17:32:00 -0400
Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> wrote:

> I guess I'm afraid that the first call to is_promised() is going
> to cause a very long pause as it loads up a very large hash of objects.

Yes, the first call will cause a long pause. (I think fsck and gc can
tolerate this, but a better solution would be appreciated.)

> Perhaps you could augment the OID lookup to remember where the object
> was found (essentially a .promisor bit set). Then you wouldn't need
> to touch them all.

Sorry - I don't understand this. Are you saying that missing promisor
objects should go into the global object hashtable, so that we can set a
flag on them?

> > The oidset will deduplicate OIDs.
>
> Right, but you still have an entry for each object. For a repo the
> size of Windows, you may have 25M+ objects in your copy of the ODB.

We have entries only for the "frontier" objects (the objects directly
referenced by any promisor object). For the Windows repo, for example, I
foresee that many of the blobs, trees, and commits will be "hiding"
behind objects that the repository user did not download into their
repo.
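
To illustrate the "frontier" point, here is a rough toy model in Python.
It is only a sketch under stated assumptions, not the actual C code:
ToyRepo, promisor_objects, and the scans counter are hypothetical names
invented for this example. It shows the two properties discussed above:
the set is built lazily on the first is_promised() call (hence the one-time
pause), and it holds only OIDs directly referenced by promisor objects, so
anything "hiding" behind an undownloaded object never gets an entry.

```python
# Toy model only: every name here is hypothetical, not git's real API.

class ToyRepo:
    def __init__(self, promisor_objects):
        # Maps the OID of each locally present promisor object to the
        # OIDs that object references directly.
        self.promisor_objects = promisor_objects
        self._promised = None  # built lazily, on the first is_promised() call
        self.scans = 0         # how many times the full scan has run

    def _load_promised(self):
        # The expensive step: walk every promisor object once.
        self.scans += 1
        promised = set()  # a set deduplicates OIDs, like the oidset
        for referenced in self.promisor_objects.values():
            promised.update(referenced)
        return promised

    def is_promised(self, oid):
        if self._promised is None:
            self._promised = self._load_promised()
        return oid in self._promised


# Commits c1 and c2 were downloaded; tree t1 and commit c0 were not.
# t1 would reference blob b1, but since t1 itself is missing, nothing
# local mentions b1, so b1 gets no entry.
repo = ToyRepo({"c1": ["t1", "c0"], "c2": ["t1", "c1"]})
frontier_hit = repo.is_promised("t1")
hidden_hit = repo.is_promised("b1")
```

In this model t1 counts as promised and b1 does not, and the scan runs
only once no matter how many lookups follow.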