In July 2011, near patch b0d40c92adafde7c2d81203ce7c1c69275f41140,
Dave Chinner introduced the concept of per-superblock shrinkers.
However, the vfs still uses its own function, prune_icache_sb, which
gives the underlying file system almost no control over inode eviction.

The trouble is, some file systems (GFS2 in particular) need more control
over the eviction of inodes. When GFS2 evicts an inode, it may need to
do inter-node locking and unlocking, for which it calls the distributed
lock manager, DLM. But DLM may not be able to service the request
immediately. For example, if a cluster node has failed, DLM may need to
wait for that node to be fenced which, in turn, may block on a response
from the user space fence daemon. That, in turn, may block waiting on
the memory allocation which caused the shrinker to be called in the
first place. Thus, GFS2 sits forever in an unrecoverable deadlock.

This is not unique to GFS2: OCFS2 and NFS probably have the same issue.

The first patch extends Dave Chinner's idea further, adding a
filesystem-specific prune_icache_sb function and exporting the existing
one.

The second patch changes GFS2 to provide a prune_icache_sb function,
which really just tells its running daemon to shrink the inode slab
when it can, and by how many items. The daemon does so by calling the
vfs prune_icache_sb to do the majority of the work.

Deferring the shrinker to the GFS2 daemon allows the vfs shrinker to
proceed without blocking on GFS2. The process that caused the shrink
(in the example above, the fence daemon) may then proceed without
blocking. The daemon can then safely evict the inodes, calling the DLM
without the risk of deadlock.

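
For readers who want the shape of the idea without reading the patches,
here is a minimal user-space sketch of the deferral pattern described
above. It is not the code from this series: struct sb, struct sb_ops,
shrink_sb, generic_prune_icache_sb and the clusterfs_* names are all
hypothetical stand-ins, and the "daemon" is modelled as a plain function
call rather than a kernel thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sb;

struct sb_ops {
	/* Optional per-filesystem hook; NULL means "use the generic path". */
	long (*prune_icache_sb)(struct sb *sb, long nr_to_scan);
};

struct sb {
	const struct sb_ops *ops;
	long cached_inodes;	/* inodes currently eligible for eviction */
	long deferred_prune;	/* work queued for the daemon, not yet done */
};

/* Generic pruning: evict up to nr_to_scan inodes, return how many went. */
long generic_prune_icache_sb(struct sb *sb, long nr_to_scan)
{
	long freed;

	assert(nr_to_scan >= 0);
	freed = nr_to_scan < sb->cached_inodes ? nr_to_scan : sb->cached_inodes;
	sb->cached_inodes -= freed;
	return freed;
}

/* Shrinker entry point: prefer the filesystem hook when one is set. */
long shrink_sb(struct sb *sb, long nr_to_scan)
{
	if (sb->ops && sb->ops->prune_icache_sb)
		return sb->ops->prune_icache_sb(sb, nr_to_scan);
	return generic_prune_icache_sb(sb, nr_to_scan);
}

/*
 * Cluster filesystem hook: only record the request and return at once,
 * so the caller (possibly in memory reclaim) never waits on cluster locks.
 */
long clusterfs_prune_icache_sb(struct sb *sb, long nr_to_scan)
{
	sb->deferred_prune += nr_to_scan;
	return 0;	/* nothing freed yet; the daemon does the real work */
}

/*
 * Daemon side: runs later, in a context where blocking on the lock
 * manager is acceptable, and reuses the generic pruning code.
 */
long clusterfs_daemon_run(struct sb *sb)
{
	long nr = sb->deferred_prune;

	sb->deferred_prune = 0;
	return generic_prune_icache_sb(sb, nr);
}

const struct sb_ops clusterfs_ops = {
	.prune_icache_sb = clusterfs_prune_icache_sb,
};
```

With a struct sb whose ops points at clusterfs_ops, shrink_sb() returns
immediately and only bumps deferred_prune; the inodes go away when
clusterfs_daemon_run() is called. A struct sb with no hook falls through
to the generic path and is pruned synchronously, as before.
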
Signed-off-by: Bob Peterson <rpeterso@xxxxxxxxxx>
---
Bob Peterson (2):
  vfs: Add hooks for filesystem-specific prune_icache_sb
  GFS2: Add a gfs2-specific prune_icache_sb

 Documentation/filesystems/vfs.txt | 15 +++++++++++++++
 fs/gfs2/incore.h                  |  2 ++
 fs/gfs2/ops_fstype.c              |  1 +
 fs/gfs2/quota.c                   | 25 +++++++++++++++++++++++++
 fs/gfs2/super.c                   | 13 +++++++++++++
 fs/inode.c                        |  1 +
 fs/super.c                        |  5 ++++-
 include/linux/fs.h                |  3 +++
 8 files changed, 64 insertions(+), 1 deletion(-)

--
2.5.5