Re: inotify to minimize stat() calls

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Feb 13, 2013 at 09:25:59PM +0100, Karsten Blees wrote:

> Am 13.02.2013 19:18, schrieb Jeff King:
> > Moreover, looking at it again, I
> > don't think my patch produces the right behavior: we have a single
> > dir_next pointer, even though the same ce_entry may appear under many
> > directory hashes. So the cache_entries that has to "dir/foo/" and those
> > that hash to "dir/bar/" may get confused, because they will also both be
> > found under "dir/", and both try to create a linked list from the
> > dir_next pointer.
> > 
> 
> Indeed. In the worst case, this causes an endless loop if ce->dir_next == ce
> ---8<---
> mkdir V
> mkdir V/XQANY
> mkdir WURZAUP
> touch V/XQANY/test
> git init
> git config core.ignorecase true
> git add .
> git status
> ---8<---

Great, thanks for the test case. I can trivially replicate the endless
loop. The patch I sent earlier fixes that. So it's at least a step in
the (possible) right direction. I'm slightly concerned that there is
some other case that is expecting the directories in the main hash, but
I think I got them all.

> Note: "V/", "V/XQANY/" and "WURZAUP/" all have the same hash_name.
> Although I found those strange values by brute force, hash collisions
> in 32 bit values are not that uncommon in real life :-)

Cute. :)

> Alternatively, we could simply create normal cache_entries for the
> directories that are linked via ce->next, but have a trailing '/' in
> their name?
>
> Reference counting sounds good to me, at least better than allocating
> directory entries per cache entry * parent dirs.

I think that is more or less what my patch does, but it splits the
ref-counted fake cache_entries out into a separate hash of "struct
dir_hash_entry" rather than storing it in the regular hash. Which IMHO
is a bit cleaner for two reasons:

  1. You do not have to pay the memory price of storing fake
     cache_entries (the name+refcount struct for each directory is much
     smaller than a real cache_entry).

  2. It makes the code a bit simpler, as you do not have to do any
     "check for trailing /" magic on the result of index_name_exists to
     determine if it is a "real" name or just a fake dir entry.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]