On Fri, Jun 24, 2016 at 02:50:10PM -0500, Bob Peterson wrote:
> This patch adds filesystem-specific callbacks for shrinking the
> inode cache, prune_icache_sb. This is provided for filesystems in
> which the default VFS prune_icache_sb needs to be delayed due to
> filesystem locking requirements not met by the VFS.
> 
> Signed-off-by: Bob Peterson <rpeterso@xxxxxxxxxx>
> ---
>  Documentation/filesystems/vfs.txt | 15 +++++++++++++++
>  fs/inode.c                        |  1 +
>  fs/super.c                        |  5 ++++-
>  include/linux/fs.h                |  3 +++
>  4 files changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
> index c61a223..7cb4c5c 100644
> --- a/Documentation/filesystems/vfs.txt
> +++ b/Documentation/filesystems/vfs.txt
> @@ -230,6 +230,7 @@ struct super_operations {
>  	ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t);
>  	int (*nr_cached_objects)(struct super_block *);
>  	void (*free_cached_objects)(struct super_block *, int);
> +	long (*prune_icache_sb)(struct super_block *, struct shrink_control *);
>  };

IIRC, I originally proposed this before adding the
nr_cached_objects/free_cached_objects interface because it allowed
filesystems direct control over inode reclaim. I don't recall the
reasons for it being rejected now - I think the main one was that the
inode life cycle was owned by the VFS, not the filesystem, and that it
would be too easy for people to implement their own shrinkers and get
it all wrong and hence screw over the entire system....

> All methods are called without any locks being held, unless otherwise
> @@ -319,6 +320,20 @@ or bottom half).
>  	implementations will cause holdoff problems due to large scan batch
>  	sizes.
> 
> +  prune_icache_sb: called by the sb cache shrinking function for the
> +	filesystem to reduce the number of inodes from slab.
> +	This is done to
> +	accommodate file systems that may not be able to immediately remove
> +	inodes from cache, but must queue them to be removed ASAP.
> +
> +	This can happen in GFS2, for example, where evicting an inode
> +	may require an inter-node lock (glock) which makes a call into DLM
> +	(distributed lock manager), which may block for any number of reasons.
> +	For example, it may block because a cluster node needs to be fenced,
> +	so the lock cannot be granted until the fencing is complete.
> +	The fencing, in turn, may be blocked for other reasons, such as
> +	memory allocations that caused the shrinker to be called in the first
> +	place. Optional. If not set, the default VFS prune_icache_sb is called.

So why don't you simply do what XFS does? i.e. when ->evict/destroy is
called on an inode, you clean up the VFS portion of the inode but do
not destroy/free it - you simply queue it to an internal list and then
do the cleanup/freeing in your own time?

i.e. why do you need a special callout just to defer freeing to another
thread when we already have hooks that enable you to do this?

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx