On Thu, Apr 11, 2019 at 05:47:46AM +0100, Al Viro wrote: > On Thu, Apr 11, 2019 at 12:48:21PM +1000, Tobin C. Harding wrote: > > > Oh, so putting entries on a shrink list is enough to pin them? > > Not exactly pin, but __dentry_kill() has this: > if (dentry->d_flags & DCACHE_SHRINK_LIST) { > dentry->d_flags |= DCACHE_MAY_FREE; > can_free = false; > } > spin_unlock(&dentry->d_lock); > if (likely(can_free)) > dentry_free(dentry); > and shrink_dentry_list() - this: > if (dentry->d_lockref.count < 0) > can_free = dentry->d_flags & DCACHE_MAY_FREE; > spin_unlock(&dentry->d_lock); > if (can_free) > dentry_free(dentry); > continue; > so if dentry destruction comes before we get around to > shrink_dentry_list(), it'll stop short of dentry_free() and mark it for > shrink_dentry_list() to do just dentry_free(); if it overlaps with > shrink_dentry_list(), but doesn't progress all the way to freeing, > we will > * have dentry removed from shrink list > * notice the negative ->d_count (i.e. that it has already reached > __dentry_kill()) > * see that __dentry_kill() is not through with tearing the sucker > apart (no DCACHE_MAY_FREE set) > ... and just leave it alone, letting __dentry_kill() do the rest of its > thing - it's already off the shrink list, so __dentry_kill() will do > everything, including dentry_free(). > > The reason for that dance is the locking - shrink list belongs to whoever > has set it up and nobody else is modifying it. So __dentry_kill() doesn't > even try to remove the victim from there; it does all the teardown > (detaches from inode, unhashes, etc.) and leaves removal from the shrink > list and actual freeing to the owner of shrink list. That way we don't > have to protect all shrink lists a single lock (contention on it would > be painful) and we don't have to play with per-shrink-list locks and > all the attendant headaches (those lists usually live on stack frame > of some function, so just having the lock next to the list_head would > do us no good, etc.). Much easier to have the shrink_dentry_list() > do all the manipulations... > > The bottom line is, once it's on a shrink list, it'll stay there > until shrink_dentry_list(). It may get extra references after > being inserted there (e.g. be found by hash lookup), it may drop > those, whatever - it won't get freed until we run shrink_dentry_list(). > If it ends up with extra references, no problem - shrink_dentry_list() > will just kick it off the shrink list and leave it alone. > > Note, BTW, that umount coming between isolate and drop is not a problem; > it call shrink_dcache_parent() on the root. And if shrink_dcache_parent() > finds something on (another) shrink list, it won't put it to the shrink > list of its own, but it will make note of that and repeat the scan in > such case. So if we find something with zero refcount and not on > shrink list, we can move it to our shrink list and be sure that its > superblock won't go away under us... Man, that was good to read. Thanks for taking the time to write this. Tobin