Re: [Help]Cannot read symbolic link(Invalid argument) in overlayfs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Mar 29, 2019 at 10:02 AM koishi komeiji <maykagura@xxxxxxxxx> wrote:

>> I imagine that if you restart container or just drop caches problem
>> goes away?
>
> I try to drop caches, the problem goes away immediately.
> So I think the problem is about dentry and inode cache.
>
> # ls -l | grep invalid
> ls: cannot read symbolic link librmifm.so: Invalid argument
> ls: cannot read symbolic link libdrv_vlan.so: Invalid argument
> #
> # echo 2 > /proc/sys/vm/drop_caches
> #
> # ls -l | grep invalid
> #
>
>
>
>> Do you know if the symlink was just recently created or if it existed before
>> container has been started?
>
> The symlink in the image is ok. when the container start, the  symlink
> is decompressed from a cpio file.In this time ,it is still  ok. Just
> after some time, it break suddenly.
>
>

Koishi,

I have a theory, but it is not so easy to test.
I am not sure if you are able to compile and install a new overlayfs module.
Attached is un-tested patch to solve the speculated problem.

This problem should not exist in stable kernel >= v4.14, so if you
can update the host kernel that would be the best option for you.

The problem described in the patch was solved in upstream kernel by
commits:
09d8b586731b ovl: move __upperdentry to ovl_inode
31747eda41ef ovl: hash directory inodes for fsnotify

But those depend on many other changes, so cannot be easily
backported to kernel v4.9.

What I speculate that happens is:
1. Some ovl dentry holds a reference on upper dentry
2. upper dentry hold a ref on upper NON-symlink inode
3. overlay inode->i_private point to upper inode, but does not
    hold a refcount to upper inode
4. overlay inode is hashed by value of upper inode pointer
5. At some point, overlay dentry, upper dentry and upper inode
   are dropped, but overlay inode remains with elevated refcount
   (maybe by inotify) and hashed
6. A new lookup of the symlink allocates a new upper inode
    struct and reuses the address of NON-symlink free upper inode
7. ovl lookup finds the old ovl inode in cache by upper inode
    address, but the found overlay inode is not a symlink
8. The new ovl dentry is !d_is_symlink, although stat(2) will show
    i_mode from real inode that is a symlink
9. readlink(2) of ovl dentry will also fail with -EINVAL
10. open(2) on ovl dentry will not try to follow symlink and will
    eventually fail to open the symlink itself with -ELOOP

Thanks,
Amir.
From e163516be9f1b82f070b114b852a2d659c755fb2 Mon Sep 17 00:00:00 2001
From: Amir Goldstein <amir73il@xxxxxxxxx>
Date: Fri, 29 Mar 2019 14:16:42 +0300
Subject: [PATCH] ovl: grab a reference on real inode

The overlay inode->i_private pointer references the real inode.
Normally, the real inode refcount is held by the overlay real
dentry reference.

However, if overlay inode is pinned, for example by inotify watch
and caches are dropped, the overlay dentry goes away with the
elevated refcounts and then inode->i_private may be left with a
dangling referece.

If said overlay inode is also upper and hashed (by dead real inode
pointer) a new lookup with new real inode that has the same address
as old real inode may find the bad overlay inode in cache.

Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
---
 fs/overlayfs/inode.c |  5 ++++-
 fs/overlayfs/super.c | 12 +++++++++---
 2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 16f6db88c8e5..734230a5244e 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -416,7 +416,10 @@ static int ovl_inode_test(struct inode *inode, void *data)
 
 static int ovl_inode_set(struct inode *inode, void *data)
 {
-	inode->i_private = (void *) (((unsigned long) data) | OVL_ISUPPER_MASK);
+	struct inode *realinode = data;
+
+	inode->i_private = (void *) (((unsigned long) igrab(realinode)) |
+			   OVL_ISUPPER_MASK);
 	return 0;
 }
 
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index e7c8ac41e288..75500928bba6 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -148,7 +148,7 @@ struct dentry *ovl_dentry_real(struct dentry *dentry)
 static void ovl_inode_init(struct inode *inode, struct inode *realinode,
 			   bool is_upper)
 {
-	WRITE_ONCE(inode->i_private, (unsigned long) realinode |
+	WRITE_ONCE(inode->i_private, (unsigned long) igrab(realinode) |
 		   (is_upper ? OVL_ISUPPER_MASK : 0));
 }
 
@@ -234,7 +234,7 @@ void ovl_inode_update(struct inode *inode, struct inode *upperinode)
 	WARN_ON(!upperinode);
 	WARN_ON(!inode_unhashed(inode));
 	WRITE_ONCE(inode->i_private,
-		   (unsigned long) upperinode | OVL_ISUPPER_MASK);
+		   (unsigned long) igrab(upperinode) | OVL_ISUPPER_MASK);
 	if (!S_ISDIR(upperinode->i_mode))
 		__insert_inode_hash(inode, (unsigned long) upperinode);
 }
@@ -687,12 +687,18 @@ static int ovl_remount(struct super_block *sb, int *flags, char *data)
 	return 0;
 }
 
+static int ovl_drop_inode(struct inode *inode)
+{
+	iput(ovl_inode_real(inode));
+	return 1;
+}
+
 static const struct super_operations ovl_super_operations = {
 	.put_super	= ovl_put_super,
 	.statfs		= ovl_statfs,
 	.show_options	= ovl_show_options,
 	.remount_fs	= ovl_remount,
-	.drop_inode	= generic_delete_inode,
+	.drop_inode	= ovl_drop_inode,
 };
 
 enum {
-- 
2.7.4


[Index of Archives]     [Linux Filesystems Devel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux