[PATCH] VFS: Cut down inode->i_op->xyz accesses in path walking

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



One of the biggest remaining unnecessary costs in path walking is the 
pointer chasing in inode operations.  We already avoided the dentry->d_op 
derferences with the DCACHE_OP_xyz flags, this just starts doing the same 
thing for the i_op->xyz cases.

Maybe-signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
---

This is a cut-down version of the earlier PATCH 1/2, forward-ported to the 
current vfs state.

It's tested, although I don't claim any serious amount of stressing of it. 
And I changed it so that those DCACHE_OP_{LOOKUP|FOLLOW_LINK} bits are 
only really valid when there is an inode at all. So the bits don't get 
invalidated when turning the dentry negative, they only get re-validated 
when the entry is instantiated again.

If something sets dentry->d_inode without going through __d_instantiate, 
that misses it. I'm looking at d_obtain_alias(), and wondering, for 
example.

This is a very minimal version of the previous patch: those *two* places 
where we now test the new DCACHE_OP_xyz bits are where we really see the 
dereferencing of the inode->i_op in the normal path walking. HOWEVER, the 
"inode_permission()" function also shows up pretty clearly on profiles, 
and the hottest part of that is the testing of "inode->i_op->permission".

However, with Al's other changes, the old special-case "exec_permission()" 
no longer exists. The whole reason it existed before was that the execute 
permission checking on directory entries is/was special from a performance 
standpoint. So it might make sense to resurrect it, and then add the
DCACHE_OP_PERMISSION case like in my older patch.

Anyway, it's not pretty, but that double indirect into a cold-cache
"inode->i_op->xyz" really does show up. So consider this an RFC. The "move 
ACL to generic" code was a serious patch, this is more a fodder for 
discussion.

                Linus

---
 fs/dcache.c            |   10 +++++++++-
 fs/namei.c             |    4 ++--
 include/linux/dcache.h |    6 ++++++
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index be18598c7fd7..3f9d1480f349 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1303,12 +1303,20 @@ EXPORT_SYMBOL(d_set_d_op);
 
 static void __d_instantiate(struct dentry *dentry, struct inode *inode)
 {
+	unsigned int flags;
+
 	spin_lock(&dentry->d_lock);
+	flags = dentry->d_flags & ~DCACHE_OP_INODE;
 	if (inode) {
 		if (unlikely(IS_AUTOMOUNT(inode)))
-			dentry->d_flags |= DCACHE_NEED_AUTOMOUNT;
+			flags |= DCACHE_NEED_AUTOMOUNT;
 		list_add(&dentry->d_alias, &inode->i_dentry);
+		if (inode->i_op->lookup)
+			flags |= DCACHE_OP_LOOKUP;
+		if (inode->i_op->follow_link)
+			flags |= DCACHE_OP_FOLLOW_LINK;
 	}
+	dentry->d_flags = flags;
 	dentry->d_inode = inode;
 	dentry_rcuwalk_barrier(dentry);
 	spin_unlock(&dentry->d_lock);
diff --git a/fs/namei.c b/fs/namei.c
index 120efc76d3d0..8c0955c3a4b2 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1262,7 +1262,7 @@ static inline int walk_component(struct nameidata *nd, struct path *path,
 		terminate_walk(nd);
 		return -ENOENT;
 	}
-	if (unlikely(inode->i_op->follow_link) && follow) {
+	if (unlikely(path->dentry->d_flags & DCACHE_OP_FOLLOW_LINK) && follow) {
 		if (nd->flags & LOOKUP_RCU) {
 			if (unlikely(unlazy_walk(nd, path->dentry))) {
 				terminate_walk(nd);
@@ -1394,7 +1394,7 @@ static int link_path_walk(const char *name, struct nameidata *nd)
 				return err;
 		}
 		err = -ENOTDIR; 
-		if (!nd->inode->i_op->lookup)
+		if (!(nd->path.dentry->d_flags & DCACHE_OP_LOOKUP))
 			break;
 		continue;
 		/* here ends the main loop */
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 3f22d8d6d8a3..c719566c4088 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -208,6 +208,12 @@ struct dentry_operations {
 #define DCACHE_CANT_MOUNT	0x0100
 #define DCACHE_GENOCIDE		0x0200
 
+/* Avoid dereferencing inode->i_op */
+#define DCACHE_OP_LOOKUP	0x0400
+#define DCACHE_OP_FOLLOW_LINK	0x0800
+#define DCACHE_OP_INODE	(DCACHE_OP_LOOKUP | DCACHE_OP_FOLLOW_LINK)
+
+/* Avoid dereferencing dentry->d_op */
 #define DCACHE_OP_HASH		0x1000
 #define DCACHE_OP_COMPARE	0x2000
 #define DCACHE_OP_REVALIDATE	0x4000
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux