Re: [PATCH v3] merge: fix cache_entry use-after-free

Junio C Hamano <gitster@xxxxxxxxx> · Fri, 16 Oct 2015 09:04:01 -0700

David Turner <dturner@xxxxxxxxxxxxxxxx> writes:

> We also do dozens or hundreds of merges per day and only saw this quite
> rarely.  Interestingly, we could only trigger it with an alternate
> hashing function for git's hashmap implementation, and only once a
> certain bug in that implementation was *fixed*.  But that was just a
> trigger; it was certainly not the cause.  The bug would, in general,
> have caused more hash collisions due to worse mixing, but I believe it
> is possible that some particular collision would have been present in
> the non-bugged version of the code but not in the bugged version.
>
> It may have also had something to do with a case-insensitive filesystem;
> we never saw anyone reproduce it on anything but OS X, and even there it
> was quite difficult to reproduce.
>
> In short, I don't think we know why the bug was not seen widely.

It has been a long time since I looked at name-hash.c and I am fuzzy
on what the four functions (cache|index)_(file|dir)_exists are meant
for, but I have this feeling that the original premise of the patch,
"we may free a ce because we no longer use it in the index, but it
may still need to keep a reference to it in name-hash" may be wrong
in the first place.  The proposed "fix" conceptually feels wrong.

The whole point of the name-hash is so that we can detect collisions
in two names, one of which wants to have a file in one place while
the other wants to have a directory, at the same path in the index.
The pathnames and cache entries registered in the name-hash have to
be the ones that are relevant to the index in quesition.  If a new
ce will be added to the index, the name-hash will have to know about
its path (and that is what CE_HASHED bit is about).  On the other
hand, if you are going to remove an existing ce from the index, its
sub-paths should no longer prevent other cache entries to be there.

E.g. if you have "a/b", it must prevent "A" from entering the index
and the name-hash helps you to do so; when you remove "a/b", then
name-hash must now allow "A" to enter the index.  So "a/b" must be
removed from the name-hash by calling remove_name_hash() and normal
codepaths indeed do so.

I do not doubt the existence of "use-after-free bug" you observed,
but I am not convinced that refcounting is "fixing" the problem; it
smells like papering over a different bug that is the root cause of
the use-after-free.

For example, if we forget to "unhash" a ce from name-hash when we
remove a ce from the index (or after we "hashed" it, expecting to
add it to the index, but in the end decided not to add to the index,
perhaps), we would see a now-freed ce still in the name-hash.
Checking a path against the name-hash in such a state would have to
use the ce->name from that stale ce, which is a use-after-free bug.

In such a situation, isn't the real cause of the bug the fact that
the stale ce that is no longer relevant to the true index contents
still in name-hash?  The refcounting does not fix the fact that a
ce->name of a stale ce that is no longer relevant being used for D/F
collision checking.

I am not saying that I found such a codepath that forgets to unhash,
but from the overall design and purpose of the name-hash subsystem,
anything that deliberately _allows_ a stale ce that does not exist
in the index in there smells like a workaround going in a wrong
direction.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html