On Wed, Jun 03, 2020 at 01:51:51PM -0700, Jonathan Nieder wrote: > Taylor Blau wrote: > > > Ah, this only sort of has to do with the object cache. In > > 'parse_commit_buffer()' we stop parsing parents in the case that the > > repository is shallow (this goes back to 7f3140cd23 (git repack: keep > > commits hidden by a graft, 2009-07-23)). > > Ah, good analysis. (In fact, the behavior is older: it's from > 5da5c8f4cf4 (Teach parse_commit_buffer about grafting., 2005-07-30).) > So this is additional "cached" data that needs to be invalidated by > reset_repository_shallow. > > So the question is, what other information falls into that category? > > [...] > > --- a/shallow.c > > +++ b/shallow.c > > @@ -90,6 +90,9 @@ static void reset_repository_shallow(struct repository *r) > > { > > r->parsed_objects->is_shallow = -1; > > stat_validity_clear(r->parsed_objects->shallow_stat); > > + > > (nit: the above two lines wouldn't be needed if r->parsed_objects is > being thrown away.) Right, thanks. I don't think that it matters since you point out a legitimate issue with dangling references, but serves me right for working on this so late at night ;-). > > + parsed_object_pool_clear(r->parsed_objects); > > + r->parsed_objects = parsed_object_pool_new(); > > } > > Shallows don't affect the ref store. They only affect object walks. > So r->parsed_objects does seem like the only place that could be > affected. > > That said, with this change I'd worry about use-after-free from any > existing references to objects in the pool. > > Stepping back, what I think I would like to see is to *not* have > grafts and shallow state affect the in-memory persisted parsed > objects. Instead, act as an overlay in accessors that traverse over > them. > > Lacking that, I like the idea of a "dirty bit" that gets written as > soon as we have started lying in the parsed object pool. Something > like this. What do you think? > > diff --git i/commit-graph.c w/commit-graph.c > index 2ff042fbf4f..84b49ce903b 100644 > --- i/commit-graph.c > +++ w/commit-graph.c > @@ -149,7 +149,8 @@ static int commit_graph_compatible(struct repository *r) > } > > prepare_commit_graft(r); > - if (r->parsed_objects && r->parsed_objects->grafts_nr) > + if (r->parsed_objects && > + (r->parsed_objects->grafts_nr || r->parsed_objects->substituted_parent)) This is a little tricky. Why would we set substituted_parent without also incrementing grafts_nr? That seems like the real bug here: if we incremented grafts_nr, then we would return a non-zero value from 'commit_graph_compatible' and rightly stop even without this sticky-bit. I don't quite understand this myself. If it's an oversight, it's a remarkably long-lived one. Do you have a better sense of this? > return 0; > if (is_repository_shallow(r)) > return 0; > diff --git i/commit.c w/commit.c > index 87686a7055b..762f09e53ae 100644 > --- i/commit.c > +++ w/commit.c > @@ -423,6 +423,8 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b > pptr = &item->parents; > > graft = lookup_commit_graft(r, &item->object.oid); > + if (graft) > + r->parsed_objects->substituted_parent = 1; > while (bufptr + parent_entry_len < tail && !memcmp(bufptr, "parent ", 7)) { > struct commit *new_parent; > > @@ -447,6 +449,7 @@ int parse_commit_buffer(struct repository *r, struct commit *item, const void *b > if (graft) { > int i; > struct commit *new_parent; > + Nit: unnecessary whitespace change, but I doubt it really matters much. > for (i = 0; i < graft->nr_parent; i++) { > new_parent = lookup_commit(r, > &graft->parent[i]); > diff --git i/object.h w/object.h > index b22328b8383..db02fdcd6b2 100644 > --- i/object.h > +++ w/object.h > @@ -26,6 +26,7 @@ struct parsed_object_pool { > char *alternate_shallow_file; > > int commit_graft_prepared; > + int substituted_parent; > > struct buffer_slab *buffer_slab; > }; > > Thanks, > Jonathan Thanks, Taylor