Apparently I did not save in my text editor (and didn't notice because the code comment was still valid syntactically, so everything still compiled). Here's a version with the updated and correctly formatted code comment. Jonathan Tan (3): index-pack --promisor: dedup before checking links index-pack --promisor: don't check blobs index-pack --promisor: also check commits' trees builtin/index-pack.c | 103 +++++++++++++++++++++++++++++-------------- 1 file changed, 71 insertions(+), 32 deletions(-) Range-diff against v2: 1: 7ae21c921f = 1: 7ae21c921f index-pack --promisor: dedup before checking links 2: 5a63c9a5ca ! 2: a1d2a20203 index-pack --promisor: don't check blobs @@ builtin/index-pack.c: static void record_outgoing_link(const struct object_id *o +static void maybe_record_name_entry(const struct name_entry *entry) +{ + /* -+ * The benefit of doing this is as above (fetch speedup), but the drawback -+is that if the packfile to be indexed references a local blob directly -+(that is, not through a local tree), that local blob is in danger of -+being garbage collected. Such a situation may arise if we push local -+commits, including one with a change to a blob in the root tree, -+and then the server incorporates them into its main branch through a -+"rebase" or "squash" merge strategy, and then we fetch the new main -+branch from the server. -+ -+This situation has not been observed yet - we have only noticed missing -+commits, not missing trees or blobs. (In fact, if it were believed that -+only missing commits are problematic, one could argue that we should -+also exclude trees during the outgoing link check; but it is safer to -+include them.) -+ -+Due to the rarity of the situation (it has not been observed to happen -+in real life), and because the "penalty" in such a situation is merely -+to refetch the missing blob when it's needed, the tradeoff seems -+worth it. ++ * Checking only trees here results in a significantly faster packfile ++ * indexing, but the drawback is that if the packfile to be indexed ++ * references a local blob only directly (that is, never through a ++ * local tree), that local blob is in danger of being garbage ++ * collected. Such a situation may arise if we push local commits, ++ * including one with a change to a blob in the root tree, and then the ++ * server incorporates them into its main branch through a "rebase" or ++ * "squash" merge strategy, and then we fetch the new main branch from ++ * the server. ++ * ++ * This situation has not been observed yet - we have only noticed ++ * missing commits, not missing trees or blobs. (In fact, if it were ++ * believed that only missing commits are problematic, one could argue ++ * that we should also exclude trees during the outgoing link check; ++ * but it is safer to include them.) ++ * ++ * Due to the rarity of the situation (it has not been observed to ++ * happen in real life), and because the "penalty" in such a situation ++ * is merely to refetch the missing blob when it's needed (and this ++ * happens only once - when refetched, the blob goes into a promisor ++ * pack, so it won't be GC-ed, the tradeoff seems worth it. + */ + if (S_ISDIR(entry->mode)) + record_outgoing_link(&entry->oid); 3: 8139325bf2 = 3: f9f9969a8f index-pack --promisor: also check commits' trees -- 2.47.0.338.g60cca15819-goog