Re: inotify to minimize stat() calls

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 13.02.2013 19:18, schrieb Jeff King:
> Moreover, looking at it again, I
> don't think my patch produces the right behavior: we have a single
> dir_next pointer, even though the same ce_entry may appear under many
> directory hashes. So the cache_entries that has to "dir/foo/" and those
> that hash to "dir/bar/" may get confused, because they will also both be
> found under "dir/", and both try to create a linked list from the
> dir_next pointer.
> 

Indeed. In the worst case, this causes an endless loop if ce->dir_next == ce
---8<---
mkdir V
mkdir V/XQANY
mkdir WURZAUP
touch V/XQANY/test
git init
git config core.ignorecase true
git add .
git status
---8<---
Note: "V/", "V/XQANY/" and "WURZAUP/" all have the same hash_name. Although I found those strange values by brute force, hash collisions in 32 bit values are not that uncommon in real life :-)

> Looking at Karsten's patch, it seems like it will not add a cache entry
> if there is one of the same name. But I'm not sure if that is right, as
> the old one might be CE_UNHASHED (or it might get removed later). You
> actually want to be able to find each cache_entry that has a file under
> the directory at the hash of that directory, so you can make sure it is
> still valid.
> 

Yes, the patch was just to show potential performance savings, I didn't consider CE_UNHASHED at all.

> I think the best way forward is to actually create a separate hash table
> for the directory lookups. I note that we only care about these entries
> in directory_exists_in_index_icase, which is really about whether
> something is there, versus what exactly is there. So could we maybe get
> by with a separate hash table that stores a count of entries at each
> directory, and increment/decrement the count when we add/remove entries?
> 

Alternatively, we could simply create normal cache_entries for the directories that are linked via ce->next, but have a trailing '/' in their name?

Reference counting sounds good to me, at least better than allocating directory entries per cache entry * parent dirs.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]