[PATCH 0/2] Per superblock inode reclaim

In July 2011, in commit b0d40c92adafde7c2d81203ce7c1c69275f41140,
Dave Chinner introduced the concept of per-superblock shrinkers.
However, vfs still prunes the inode cache with its own function,
prune_icache_sb, which gives the underlying file system almost no
control over inode eviction.

The trouble is, some file systems (GFS2 in particular) need more
control over the eviction of inodes. Evicting an inode in GFS2 may
require inter-node locking and unlocking, for which GFS2 calls the
distributed lock manager, DLM. But DLM may not be able to service
the request immediately.

For example, if a cluster node has failed, DLM may need to wait for
that node to be fenced which, in turn, may block on a response from the
user space fence daemon. That, in turn, may block waiting on memory
allocation which caused the shrinker to be called in the first place.
Thus, GFS2 sits forever in an unrecoverable deadlock.

This is not unique to GFS2: OCFS2 and NFS probably have the same issue.

The first patch extends Dave Chinner's idea further, adding a
filesystem-specific prune_icache_sb function and exporting the
existing one.
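
To illustrate the idea behind the first patch, here is a minimal
userspace sketch of such a hook. It is not the code from the patch:
the structures are simplified stand-ins, the hook takes a plain count
rather than the kernel's shrink control structure, and the names
generic_prune_icache_sb and sb_prune_icache are assumptions made up
for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct super_block;

struct super_operations {
	/* Optional hook; NULL means "use the generic pruner". */
	long (*prune_icache_sb)(struct super_block *sb,
				unsigned long nr_to_scan);
};

struct super_block {
	const struct super_operations *s_op;
	unsigned long nr_cached_inodes;	/* toy model of the inode LRU length */
};

/* Models the existing (exported) pruner: frees up to nr_to_scan inodes. */
long generic_prune_icache_sb(struct super_block *sb, unsigned long nr_to_scan)
{
	unsigned long freed;

	assert(sb != NULL);
	freed = nr_to_scan < sb->nr_cached_inodes ?
		nr_to_scan : sb->nr_cached_inodes;
	sb->nr_cached_inodes -= freed;
	return (long)freed;
}

/* Models the shrinker call site: prefer the file system's hook if set. */
long sb_prune_icache(struct super_block *sb, unsigned long nr_to_scan)
{
	assert(sb != NULL);
	if (sb->s_op && sb->s_op->prune_icache_sb)
		return sb->s_op->prune_icache_sb(sb, nr_to_scan);
	return generic_prune_icache_sb(sb, nr_to_scan);
}
```

A file system that leaves the hook unset keeps today's behavior, so
only file systems that opt in see any change.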

The second patch changes GFS2 to provide a prune_icache_sb function,
which really just tells its running daemon to shrink the inode slab
when it can, and by how many items. The daemon then calls the vfs
prune_icache_sb to do the majority of the work. Deferring the
shrinker to the GFS2 daemon allows the vfs shrinker to proceed
without blocking on GFS2. The process that caused the shrink (in the
example above, the fence daemon) may then proceed without blocking.
The daemon can then safely evict the inodes, calling the DLM without
the risk of deadlock.
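
The deferral described above can be sketched in userspace roughly as
follows. This is a single-threaded toy model, not the GFS2 code: the
struct and the names fs_prune_hook, fs_daemon_run and generic_prune
are hypothetical, and a real implementation would need proper locking
or atomics plus an actual wake-up of the daemon thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of one mounted file system. */
struct fs_state {
	unsigned long cached_inodes;	/* inodes currently cached */
	unsigned long deferred_scan;	/* eviction work queued for the daemon */
	int daemon_wakeups;		/* counts wake-up requests */
};

/* Stand-in for the generic pruner, which may block in a real file system. */
static unsigned long generic_prune(struct fs_state *fs, unsigned long nr)
{
	unsigned long freed = nr < fs->cached_inodes ? nr : fs->cached_inodes;

	fs->cached_inodes -= freed;
	return freed;
}

/*
 * Shrinker-side hook: never evicts anything itself. It only records how
 * many items were requested and asks the daemon to run, so the caller
 * (for example the fence daemon) is not blocked. Reports 0 freed.
 */
unsigned long fs_prune_hook(struct fs_state *fs, unsigned long nr_to_scan)
{
	assert(fs != NULL);
	fs->deferred_scan += nr_to_scan;
	fs->daemon_wakeups++;
	return 0;
}

/*
 * Daemon side: runs later in its own context, where waiting on the
 * lock manager is safe, and performs the deferred eviction.
 */
unsigned long fs_daemon_run(struct fs_state *fs)
{
	unsigned long nr;

	assert(fs != NULL);
	nr = fs->deferred_scan;
	fs->deferred_scan = 0;
	return nr ? generic_prune(fs, nr) : 0;
}
```

The point of the split is that the only work done in the shrinker's
context is bookkeeping; anything that might wait on DLM happens in
the daemon.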

Signed-off-by: Bob Peterson <rpeterso@xxxxxxxxxx>
---
Bob Peterson (2):
  vfs: Add hooks for filesystem-specific prune_icache_sb
  GFS2: Add a gfs2-specific prune_icache_sb

 Documentation/filesystems/vfs.txt | 15 +++++++++++++++
 fs/gfs2/incore.h                  |  2 ++
 fs/gfs2/ops_fstype.c              |  1 +
 fs/gfs2/quota.c                   | 25 +++++++++++++++++++++++++
 fs/gfs2/super.c                   | 13 +++++++++++++
 fs/inode.c                        |  1 +
 fs/super.c                        |  5 ++++-
 include/linux/fs.h                |  3 +++
 8 files changed, 64 insertions(+), 1 deletion(-)

-- 
2.5.5
