On Mon, Apr 12, 2021 at 07:47:41PM -0400, Jeff King wrote: > The patch below drops the peak heap to 165MB. Still quite a bit more, > but I think it's a combination of delta-base cache (96MB) plus extra > structs for all the non-commit objects whose flags we marked. I think we can do even better than that, after looking into the "do we really need to parse the objects?" comment I left (spoiler: the answer is no, we do not need to, at least for that caller). Here are some cleaned-up patches that I think improve the situation quite a bit. This is just the low-hanging fruit from this part of the discussion; I'm sure there's more to do to make using partial clones pleasant. In particular: - this does nothing for the "oops, we turned all of the promisor objects loose and then deleted them" problem. Hopefully Rafael will produce a nice patch for that - In is_promisor_object(), we still call parse_object(), because it really does look at the contents. But doing so for blobs is wasteful (it's a lot of bytes we push through sha1, and we don't even look at them). It might be worth using oid_object_info() to avoid this. This introduces a little overhead, but I think would be a net win (it would be really nice if we could amortize the object lookup work; i.e., if there was a way to call oid_object_info_extended() and say "open the object and look at the type; only return the contents if it's a non-blob"). - I still think it's probably worth having a mode where we store the set of pointed-to objects we don't have, rather than parsing on the fly. This could easily be made optional (it's a good thing if you _have_ a lot of objects that point to a small or moderate number of missing objects, like a blob:limit or blob:none filter; it's a bad thing if you have very few objects but they point to a very large number). I didn't explore any of those here, and I don't plan to look into them anytime soon. I'm just documenting my findings for later. Anyway, here are the patches. [1/3]: is_promisor_object(): free tree buffer after parsing [2/3]: lookup_unknown_object(): take a repository argument [3/3]: revision: avoid parsing with --exclude-promisor-objects builtin/fsck.c | 2 +- builtin/pack-objects.c | 2 +- http-push.c | 2 +- object.c | 7 +++---- object.h | 2 +- packfile.c | 1 + refs.c | 2 +- revision.c | 2 +- t/helper/test-example-decorate.c | 6 +++--- t/perf/p5600-partial-clone.sh | 12 ++++++++++++ upload-pack.c | 2 +- walker.c | 2 +- 12 files changed, 27 insertions(+), 15 deletions(-) -Peff