On Mon, 30 Dec 2013 21:45:17 +0800 Li Wang <liwang@xxxxxxxxxxxxxxx> wrote: > Analogous to shrink_dcache_parent except that it collects inodes. > It is not very appropriate to be put in dcache.c, but d_walk can only > be invoked from here. Please cc Dave Chinner on future revisions. He be da man. The overall intent of the patchset seems reasonable and I agree that it can't be efficiently done from userspace with the current kernel API. We *could* do it from userspace by providing facilities for userspace to query the VFS caches: "is this pathname in the dentry cache" and "is this inode in the inode cache". > --- a/fs/dcache.c > +++ b/fs/dcache.c > @@ -1318,6 +1318,42 @@ void shrink_dcache_parent(struct dentry *parent) > } > EXPORT_SYMBOL(shrink_dcache_parent); > > +static enum d_walk_ret gather_inode(void *data, struct dentry *dentry) > +{ > + struct list_head *list = data; > + struct inode *inode = dentry->d_inode; > + > + if ((inode == NULL) || ((!inode_owner_or_capable(inode)) && > + (!capable(CAP_SYS_ADMIN)))) > + goto out; > + spin_lock(&inode->i_lock); > + if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) || It's unclear what rationale lies behind this particular group of tests. > + (inode->i_mapping->nrpages == 0) || > + (!list_empty(&inode->i_lru))) { arg, the "Inode locking rules" at the top of fs/inode.c needs a refresh, I suspect. It is too vague. Formally, inode->i_lru is protected by i_sb->s_inode_lru->node[nid].lock, not by ->i_lock. I guess you can just do a list_lru_add() and that will atomically add the inode to your local list_lru if ->i_lru wasn't being used for anything else. I *think* that your use of i_lock works OK, because code which fiddles with i_lru and s_inode_lru also takes i_lock. However we need to decide which is the preferred and official lock. ie: what is the design here?? However... most inodes will be on an LRU list, won't they? Doesn't this reuse of i_lru mean that many inodes will fail to be processed? If so, we might need to add a new list_head to the inode, which will be problematic. Aside: inode_lru_isolate() fiddles directly with inode->i_lru without taking i_sb->s_inode_lru->node[nid].lock. Why doesn't this make a concurrent s_inode_lru walker go oops?? Should we be using list_lru_del() in there? (which should have been called list_lru_del_init(), sigh). -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html