+Duy Nguyen, who knows the split index better. On Thu, 2015-10-08 at 13:00 -0700, Junio C Hamano wrote: > David Turner <dturner@xxxxxxxxxxxxxxxx> writes: > > > From: Keith McGuigan <kmcguigan@xxxxxxxxxxx> > > > > During merges, we would previously free entries that we no longer need > > in the destination index. But those entries might also be stored in > > the dir_entry cache, and when a later call to add_to_index found them, > > they would be used after being freed. > > > > To prevent this, add a ref count for struct cache_entry. Whenever > > a cache entry is added to a data structure, the ref count is incremented; > > when it is removed from the data structure, it is decremented. When > > it hits zero, the cache_entry is freed. > > > > Signed-off-by: David Turner <dturner@xxxxxxxxxxxxxxxx> > > Signed-off-by: Keith McGuigan <kmcguigan@xxxxxxxxxxx> > > --- > > Thanks. > > > @@ -213,6 +214,32 @@ struct cache_entry { > > struct pathspec; > > > > /* > > + * Increment the cache_entry reference count. Should be called > > + * whenever a pointer to a cache_entry is retained in a data structure, > > + * thus marking it as alive. > > + */ > > +static inline void add_ce_ref(struct cache_entry *ce) > > +{ > > + assert(ce != NULL && ce->ref_count >= 0); > > + ++ce->ref_count; > > +} > > We tend to prefer post-increment when the distinction between pre- > or post- does not make any difference, which is the case here. Will fix. > > diff --git a/name-hash.c b/name-hash.c > > index 702cd05..f12c919 100644 > > --- a/name-hash.c > > +++ b/name-hash.c > > @@ -92,7 +93,9 @@ static void remove_dir_entry(struct index_state *istate, struct cache_entry *ce) > > struct dir_entry *dir = hash_dir_entry(istate, ce, ce_namelen(ce)); > > while (dir && !(--dir->nr)) { > > struct dir_entry *parent = dir->parent; > > - hashmap_remove(&istate->dir_hash, dir, NULL); > > + struct dir_entry *removed = hashmap_remove(&istate->dir_hash, dir, NULL); > > + assert(removed == dir); > > + drop_ce_ref(dir->ce); > > This is curious. In remove_name_hash() you do not have the > corresponding assert. Why is it necessary here (or is it > unnecessary over there)? It is unnecessary in any case: it's an assert rather than a check for an expected (or even unexpected) case. That just happens to be where Keith first managed to track down the use-after free, so that's where he happened to put the assert while trying to debug it. I'm in general in favor of more asserts rather than fewer, so I would prefer to keep it in. But I can remove it if you like. > > @@ -147,6 +151,7 @@ void remove_name_hash(struct index_state *istate, struct cache_entry *ce) > > return; > > ce->ce_flags &= ~CE_HASHED; > > hashmap_remove(&istate->name_hash, ce, ce); > > + drop_ce_ref(ce); > > > > if (ignore_case) > > remove_dir_entry(istate, ce); > > diff --git a/read-cache.c b/read-cache.c > > index 87204a5..442de34 100644 > > --- a/read-cache.c > > +++ b/read-cache.c > > @@ -53,6 +53,7 @@ static const char *alternate_index_output; > > static void set_index_entry(struct index_state *istate, int nr, struct cache_entry *ce) > > { > > istate->cache[nr] = ce; > > + add_ce_ref(ce); > > add_name_hash(istate, ce); > > } > > What happens to the existing entry that used to be istate->cache[nr], > which may or may not be 'ce' that is replacing it? > > It turns out that all three calling sites are safe, but it is not > immediately obvious why. Perhaps some comment in front of the > function is in order, to warn those who may have to add a new caller > or restructure the existing calling chain, that istate->cache[nr] is > expected not to hold a live reference when the function is called, > or something? Happy to add it if you want, but to me it was clear without because if there were a value in istate->cache[nr], that old value would be leaked. Let me know. > > @@ -314,8 +314,10 @@ void replace_index_entry_in_base(struct index_state *istate, > > istate->split_index->base && > > old->index <= istate->split_index->base->cache_nr) { > > new->index = old->index; > > - if (old != istate->split_index->base->cache[new->index - 1]) > > - free(istate->split_index->base->cache[new->index - 1]); > > + if (old != istate->split_index->base->cache[new->index - 1]) { > > + struct cache_entry *ce = istate->split_index->base->cache[new->index - 1]; > > + drop_ce_ref(ce); > > + } > > istate->split_index->base->cache[new->index - 1] = new; > > Does 'new' already have the right refcount at this point? I spoke to Keith, and he thinks we do need an add_ce_ref there. I'll fix that on the reroll. Duy, do you agree? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html