[PATCH 2/7] vfs: add inode iteration superblock method

From: Dave Chinner <dchinner@xxxxxxxxxx>

Add a new superblock method for iterating all cached inodes in the
inode cache.

This will be used to replace the explicit sb->s_inodes iteration,
and the caller will supply a callback function and a private data
pointer that gets passed to the callback along with each inode that
is iterated.

There are two iteration functions provided. The first is the
interface that everyone should be using - it provides a valid,
unlocked and referenced inode that any inode operation (including
blocking operations) is allowed on. The iterator infrastructure is
responsible for lifecycle management, hence the subsystem callback
only needs to implement the operation it wants to perform on all
inodes.
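
As an illustration of the callback contract described above, here is a
minimal userspace sketch. It is not kernel code: struct mock_inode,
struct mock_sb, mock_super_iter_inodes() and find_ino() are hypothetical
stand-ins, and the model covers only the return-value handling (no
locking, no reference counting, no state or flag filtering). Only the
INO_ITER_DONE and INO_ITER_ABORT values are taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Callback return values, as defined by the patch. */
#define INO_ITER_DONE	0
#define INO_ITER_ABORT	1

/* Hypothetical userspace stand-ins for struct inode and struct super_block. */
struct mock_inode {
	unsigned long i_ino;
	struct mock_inode *next;
};

struct mock_sb {
	struct mock_inode *s_inodes;
};

typedef int (*mock_iter_fn)(struct mock_inode *inode, void *priv);

/*
 * Models only how the iterator treats the callback's return value.
 * The real function also takes locks, skips inodes in transient states
 * and holds a reference across the callback.
 */
static int mock_super_iter_inodes(struct mock_sb *sb, mock_iter_fn iter_fn,
				  void *private_data)
{
	struct mock_inode *inode;
	int ret = 0;

	assert(iter_fn != NULL);
	for (inode = sb->s_inodes; inode; inode = inode->next) {
		ret = iter_fn(inode, private_data);
		if (ret == INO_ITER_ABORT) {
			ret = 0;	/* an early stop is not an error */
			break;
		}
		if (ret < 0)
			break;		/* propagate the callback's error */
	}
	return ret;
}

/* Private data handed to the example callback. */
struct find_ctx {
	unsigned long target;
	unsigned long visited;
	int found;
};

/* Example callback: stop the walk once the target inode number is seen. */
static int find_ino(struct mock_inode *inode, void *priv)
{
	struct find_ctx *ctx = priv;

	ctx->visited++;
	if (inode->i_ino != ctx->target)
		return INO_ITER_DONE;
	ctx->found = 1;
	return INO_ITER_ABORT;
}
```

A caller would fill in a struct find_ctx, pass it as the private data
pointer, and inspect it once the iterator returns.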

The second iterator interface is the unsafe variant for internal VFS
use only. It simply iterates all VFS inodes without guaranteeing
any state or taking references. This iteration is done under an RCU
read lock to ensure that the VFS inode is not freed from under
the callback. If the operation wishes to block, it must drop the
RCU context after guaranteeing that the inode will not get freed.
This unsafe iteration mechanism is needed for operations that need
tight control over the state of the inodes they need to operate on.
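
The "collect now, block later" pattern this implies for unsafe
callbacks might look roughly like the userspace sketch below. All names
(struct mock_inode, its processed marker, collect_inode(), process_all(),
BATCH_SIZE) are hypothetical and exist only for illustration. The model
has no RCU or locking, so it does not show how a real callback would
guarantee the inode stays alive after the walk ends (for example by
taking a reference before returning).

```c
#include <assert.h>
#include <stddef.h>

/* Callback return values, as defined by the patch. */
#define INO_ITER_DONE	0
#define INO_ITER_ABORT	1

#define BATCH_SIZE	2	/* arbitrary, for illustration only */

/* Hypothetical userspace stand-ins for struct inode and struct super_block. */
struct mock_inode {
	unsigned long i_ino;
	int processed;		/* hypothetical marker giving forward progress */
	struct mock_inode *next;
};

struct mock_sb {
	struct mock_inode *s_inodes;
};

typedef int (*mock_iter_fn)(struct mock_inode *inode, void *priv);

/* Models the walk only; the real function runs in atomic context. */
static void mock_super_iter_inodes_unsafe(struct mock_sb *sb,
					  mock_iter_fn iter_fn,
					  void *private_data)
{
	struct mock_inode *inode;

	for (inode = sb->s_inodes; inode; inode = inode->next)
		if (iter_fn(inode, private_data) == INO_ITER_ABORT)
			break;
}

struct batch {
	struct mock_inode *inodes[BATCH_SIZE];
	size_t nr;
};

/* Non-blocking callback: only record the inode, never operate on it here. */
static int collect_inode(struct mock_inode *inode, void *priv)
{
	struct batch *b = priv;

	if (inode->processed)
		return INO_ITER_DONE;
	b->inodes[b->nr++] = inode;
	/* A full batch ends the walk so the caller can do deferred work. */
	return b->nr == BATCH_SIZE ? INO_ITER_ABORT : INO_ITER_DONE;
}

/*
 * High level loop: walk, run the deferred work outside the walk, then
 * restart until a walk collects nothing. Returns the number of inodes
 * processed.
 */
static unsigned long process_all(struct mock_sb *sb)
{
	unsigned long total = 0;
	struct batch b;
	size_t i;

	do {
		b.nr = 0;
		mock_super_iter_inodes_unsafe(sb, collect_inode, &b);
		for (i = 0; i < b.nr; i++) {
			/* blocking work would go here, outside the walk */
			b.inodes[i]->processed = 1;
			total++;
		}
	} while (b.nr);
	return total;
}
```

The processed marker is what gives this sketch forward progress: each
restart skips inodes already handled, so the outer loop terminates.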

This mechanism allows the existing sb->s_inodes iteration models
to be maintained while providing a generic implementation for
iterating all cached inodes on the superblock.

Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>
---
 fs/internal.h      |   2 +
 fs/super.c         | 105 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h |  12 ++++++
 3 files changed, 119 insertions(+)

diff --git a/fs/internal.h b/fs/internal.h
index 37749b429e80..7039d13980c6 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -127,6 +127,8 @@ struct super_block *user_get_super(dev_t, bool excl);
 void put_super(struct super_block *sb);
 extern bool mount_capable(struct fs_context *);
 int sb_init_dio_done_wq(struct super_block *sb);
+void super_iter_inodes_unsafe(struct super_block *sb, ino_iter_fn iter_fn,
+		void *private_data);
 
 /*
  * Prepare superblock for changing its read-only state (i.e., either remount
diff --git a/fs/super.c b/fs/super.c
index a16e6a6342e0..20a9446d943a 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -167,6 +167,111 @@ static void super_wake(struct super_block *sb, unsigned int flag)
 	wake_up_var(&sb->s_flags);
 }
 
+/**
+ * super_iter_inodes - iterate all the cached inodes on a superblock
+ * @sb: superblock to iterate
+ * @iter_fn: callback to run on every inode found.
+ *
+ * This function iterates all cached inodes on a superblock that are not in
+ * the process of being initialised or torn down. It will run @iter_fn() with
+ * a valid, referenced inode, so it is safe for the caller to do anything
+ * it wants with the inode except drop the reference the iterator holds.
+ *
+ */
+int super_iter_inodes(struct super_block *sb, ino_iter_fn iter_fn,
+		void *private_data, int flags)
+{
+	struct inode *inode, *old_inode = NULL;
+	int ret = 0;
+
+	spin_lock(&sb->s_inode_list_lock);
+	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
+		spin_lock(&inode->i_lock);
+		if (inode->i_state & (I_NEW | I_FREEING | I_WILL_FREE)) {
+			spin_unlock(&inode->i_lock);
+			continue;
+		}
+
+		/*
+		 * Skip over zero refcount inodes if the caller only wants
+		 * referenced inodes to be iterated.
+		 */
+		if ((flags & INO_ITER_REFERENCED) &&
+		    !atomic_read(&inode->i_count)) {
+			spin_unlock(&inode->i_lock);
+			continue;
+		}
+
+		__iget(inode);
+		spin_unlock(&inode->i_lock);
+		spin_unlock(&sb->s_inode_list_lock);
+		iput(old_inode);
+
+		ret = iter_fn(inode, private_data);
+
+		old_inode = inode;
+		if (ret == INO_ITER_ABORT) {
+			ret = 0;
+			goto out;
+		}
+		if (ret < 0)
+			goto out;
+		cond_resched();
+		spin_lock(&sb->s_inode_list_lock);
+	}
+	spin_unlock(&sb->s_inode_list_lock);
+out:
+	iput(old_inode);
+	return ret;
+}
+
+/**
+ * super_iter_inodes_unsafe - unsafely iterate all the inodes on a superblock
+ * @sb: superblock to iterate
+ * @iter_fn: callback to run on every inode found.
+ *
+ * This is almost certainly not the function you want. It is for internal VFS
+ * operations only. Please use super_iter_inodes() instead. If you must use
+ * this function, please add a comment explaining why it is necessary and the
+ * locking that makes it safe to use this function.
+ *
+ * This function iterates all inodes that are attached to the superblock. It
+ * will pass each inode to @iter_fn unlocked and without having performed any
+ * existence checks on it.
+ *
+ * @iter_fn must perform all necessary state checks on the inode itself to
+ * ensure safe operation. super_iter_inodes_unsafe() only guarantees that the
+ * inode exists and won't be freed whilst the callback is running.
+ *
+ * @iter_fn must not block. It is run in an atomic context that is not allowed
+ * to sleep to provide the inode existence guarantees. If the callback needs to
+ * do blocking operations it needs to track the inode itself and defer those
+ * operations until after the iteration completes.
+ *
+ * @iter_fn must provide conditional reschedule checks itself. If rescheduling
+ * or deferred processing is needed, it must return INO_ITER_ABORT to return to
+ * the high level function to perform those operations. It can then restart the
+ * iteration again. The high level code must provide forwards progress
+ * guarantees if they are necessary.
+ *
+ */
+void super_iter_inodes_unsafe(struct super_block *sb, ino_iter_fn iter_fn,
+		void *private_data)
+{
+	struct inode *inode;
+	int ret;
+
+	rcu_read_lock();
+	spin_lock(&sb->s_inode_list_lock);
+	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
+		ret = iter_fn(inode, private_data);
+		if (ret == INO_ITER_ABORT)
+			break;
+	}
+	spin_unlock(&sb->s_inode_list_lock);
+	rcu_read_unlock();
+}
+
 /*
  * One thing we have to be careful of with a per-sb shrinker is that we don't
  * drop the last active reference to the superblock from within the shrinker.
diff --git a/include/linux/fs.h b/include/linux/fs.h
index eae5b67e4a15..0a6a462c45ab 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2213,6 +2213,18 @@ enum freeze_holder {
 	FREEZE_MAY_NEST		= (1U << 2),
 };
 
+/* Inode iteration callback return values */
+#define INO_ITER_DONE		0
+#define INO_ITER_ABORT		1
+
+/* Inode iteration control flags */
+#define INO_ITER_REFERENCED	(1U << 0)
+#define INO_ITER_UNSAFE		(1U << 1)
+
+typedef int (*ino_iter_fn)(struct inode *inode, void *priv);
+int super_iter_inodes(struct super_block *sb, ino_iter_fn iter_fn,
+		void *private_data, int flags);
+
 struct super_operations {
    	struct inode *(*alloc_inode)(struct super_block *sb);
 	void (*destroy_inode)(struct inode *);
-- 
2.45.2




