On 10/6/2021 5:29 AM, Phillip Wood via GitGitGadget wrote: > From: Phillip Wood <phillip.wood@xxxxxxxxxxxxx> > > In a sparse index it is possible for the tree that is being verified > to be freed while it is being verified. This happens when > index_name_pos() looks up a entry that is missing from the index and > that would be a descendant of a sparse entry. That triggers a call to > ensure_full_index() which frees the cache tree that is being verified. > Carrying on trying to verify the tree after this results in a > use-after-free bug. Instead restart the verification if a sparse index > is converted to a full index. This bug is triggered by a call to > reset_head() in "git rebase --apply". Thanks to René Scharfe for his > help analyzing the problem. Thank you for identifying an interesting case! I hadn't thought to change the mode from --merge to --apply. > In a sparse index it is possible for the tree that is being verified to > be freed while it is being verified. This is an RFC as I'm not familiar > with the cache tree code. I'm confused as to why this bug is triggered > by the sequence > > unpack_trees() > prime_cache_tree() > write_locked_index() > > but not > > unpack_trees() > write_locked_index() > > > as unpack_trees() appears to update the cache tree with > > if (!cache_tree_fully_valid(o->result.cache_tree)) > cache_tree_update(&o->result, > WRITE_TREE_SILENT | > WRITE_TREE_REPAIR); > > > and I don't understand why the cache tree from prime_cache_tree() > results in different behavior. It concerns me that this fix is hiding > another bug. prime_cache_tree() appears to clear the cache tree and start from scratch from a tree object instead of using the index. In particular, prime_cache_tree_rec() does not stop at the sparse-checkout cone, so the cache tree is the full size at that point. When the verify_one() method reaches these nodes that are outside of the cone, index_name_pos() triggers the index expansion in a way that the cache-tree that is restricted to the sparse-checkout cone does not. Hopefully that helps clear up _why_ this happens. There is a remaining issue that "git rebase --apply" will be a lot slower than "git rebase --merge" because of this construction of a cache-tree that is much larger than necessary. I will make note of this as a potential improvement for the future. > -static void verify_one(struct repository *r, > - struct index_state *istate, > - struct cache_tree *it, > - struct strbuf *path) > +static int verify_one(struct repository *r, > + struct index_state *istate, > + struct cache_tree *it, > + struct strbuf *path) > { > int i, pos, len = path->len; > struct strbuf tree_buf = STRBUF_INIT; > @@ -837,21 +837,30 @@ static void verify_one(struct repository *r, > > for (i = 0; i < it->subtree_nr; i++) { > strbuf_addf(path, "%s/", it->down[i]->name); > - verify_one(r, istate, it->down[i]->cache_tree, path); > + if (verify_one(r, istate, it->down[i]->cache_tree, path)) > + return 1; > strbuf_setlen(path, len); > } > > if (it->entry_count < 0 || > /* no verification on tests (t7003) that replace trees */ > lookup_replace_object(r, &it->oid) != &it->oid) > - return; > + return 0; > > if (path->len) { > + /* > + * If the index is sparse index_name_pos() may trigger > + * ensure_full_index() which will free the tree that is being > + * verified. > + */ > + int is_sparse = istate->sparse_index; > pos = index_name_pos(istate, path->buf, path->len); > + if (is_sparse && !istate->sparse_index) > + return 1; I think this guard is good to have, even if we fix prime_cache_tree() to avoid triggering expansion here in most cases. > if (pos >= 0) { > verify_one_sparse(r, istate, it, path, pos); > - return; > + return 0; > } > > pos = -pos - 1; > @@ -899,6 +908,7 @@ static void verify_one(struct repository *r, > oid_to_hex(&new_oid), oid_to_hex(&it->oid)); > strbuf_setlen(path, len); > strbuf_release(&tree_buf); > + return 0; > } > > void cache_tree_verify(struct repository *r, struct index_state *istate) > @@ -907,6 +917,9 @@ void cache_tree_verify(struct repository *r, struct index_state *istate) > > if (!istate->cache_tree) > return; > - verify_one(r, istate, istate->cache_tree, &path); > + if (verify_one(r, istate, istate->cache_tree, &path)) { > + strbuf_reset(&path); > + verify_one(r, istate, istate->cache_tree, &path); > + } And this limits us to doing at most two passes. Good. > test_expect_success 'merge, cherry-pick, and rebase' ' > init_repos && > > - for OPERATION in "merge -m merge" cherry-pick rebase > + for OPERATION in "merge -m merge" cherry-pick "rebase --apply" "rebase --merge" Thank you for the additional test! Thanks, -Stolee