David Turner <dturner@xxxxxxxxxxxxxxxx> writes: > The problem is not the name_hash, but with the dir_hash. The dir_hash > is only even populated on case-insensitive filesystems (which is why you > and Linus don't see this bug). The dir_hash is not designed to catch > d/f conflicts, but rather case conflicts (one side of a merge has > foo/bar; the other has FOO/bar). For each directory, it maintains a > pointer to a cache_entry. Then add_to_index uses that cache_entry to > rewrite incoming entries' parent directories to the expected case. > Weirdly, that part of add_to_index is, so far as I can tell, never > called during a merge. So where's are we using the freed value? > Probably in dir_entry_cmp, while adding another entry to the hashmap; > that's where our crash seems to have happened. And our failure depended > on the details of the hash function as well; if a certain collision did > not happen, we would not see it. > > There is, I think, another way to handle this: we could *copy* the path > into the struct dir_entry instead of pointing to a cache_entry. But > this seems like a bunch of extra work; the refcounting is simpler. This codepath is a mess. Whoever wrote hash_dir_entry() seems to have already known that the code is buggy by leaving this comment there: * Note that the cache_entry stored with the dir_entry merely * supplies the name of the directory (up to dir_entry.namelen). We * track the number of 'active' files in a directory in dir_entry.nr, * so we can tell if the directory is still relevant, e.g. for git * status. However, if cache_entries are removed, we cannot pinpoint * an exact cache_entry that's still active. It is very possible that * multiple dir_entries point to the same cache_entry. If you add "a/frotz" to the index, the code will somehow create a dir_entry to represent that the canonical case for directory "a/" is such, so that when you later try to add "A/nitfol", the code can grab the canonical path "a/" from dir_entry hash and turn it into "a/nitfol". That is what happens in add_to_index in read-cache.c But what happens if, after adding "a/frotz" and "a/nitfol" to the index like that, you remove "a/frotz"? I do not see any code that notices the poitner to "a/frotz" will become stale and replace it with a pointer to "a/nitfol" that is still in the system. The next time you try to add "a/xyzzy", find_dir_entry() will try to see if there is an existing entry that matches "a/xyzzy"'s directory, and one of the dir_entries has a stale pointer to ce for "a/frotz" that is already gone. I think the root cause of the bug is the external interface to the index_dir_exists() function. For the above processing, there is no reason for dir_entry() to hold a pointer to ce for "a/frotz" or "a/nitfol". All it needs to have is an embedded string in the structure itself, i.e. struct dir_entry { struct hashmap_entry ent; struct dir_entry *parent; int nr; unsigned int namelen; char name[FLEX_ARRAY]; }; The caller of index_dir_exists() in add_to_index() can be adjusted to work with a new function in name-hash.c that does this part: const char *startPtr = ce->name; const char *ptr = startPtr; while (*ptr) { while (*ptr && *ptr != '/') ++ptr; if (*ptr == '/') { struct cache_entry *foundce; ++ptr; foundce = index_dir_exists(istate, ce->name, ptr - ce->name - 1); if (foundce) { memcpy((void *)startPtr, foundce->name + (startPtr - ce->name), ptr - startPtr); startPtr = ptr; } } } Perhaps name it "adjust_dirname_case(istate, ce->name)" and have it do the whole while loop above, all inside name-hash.c. That would make this caller a lot easier to read and understand what is going on. The other one, directory_exists_in_index_icase() in dir.c, expects that a ce is returned, but the way it uses the returned value does not really require the function to return a ce. It does look at the ce->ce_mode but that is only because the way index_dir_exists() tells its caller if there is a directory (hence some files inside it) or if there is a submodule (hence there is nothing below it) by making a fallback call internally to index_file_exists() and returns the ce for gitlink only when (1) there is no directory in dir_hash and (2) file_hash as a submodule in that path. A more cleaner helper function to give what this caller really wants to know that name-hash.c can provide would be a function that takes pathname and len and returns an enum: "there is nothing there", "there is a directory", or "there is a submodule". For completeness of the API, even though this sole caller does not need it, we could throw in "there is a regular file" there, too, if we wanted to. If I were to clean this up, I would probably use the above updated definition of struct dir_entry and: * drop the fallback "return gitlink if there is one" from index_dir_exists(), and make index_dir_exists() return 0 (false) or 1 (true). * have directory_exists_in_index() first call the updated index_dir_exists() via the cache_dir_exists() convenience macro. If it returned true, it should return index_direcgtory. If it returned false, make a call to cache_file_exists() to see if there is a gitlink and rturn index_gitdir when that is the case. When all of the above fails, return index_nonexistent. That way, we do not have to worry about "this ce no longer exists in the main index but we need to keep it around", saving 8-bytes from each ce and refcounting mess in the code. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html