Re: [PATCH v2] fuse: fix deadlock between atomic O_TRUNC open() and page invalidations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Dec 29, 2021 at 12:02:39PM +0800, Jiachen Zhang wrote:
> fuse_finish_open() will be called with FUSE_NOWRITE set in case of atomic
> O_TRUNC open(), so commit 76224355db75 ("fuse: truncate pagecache on
> atomic_o_trunc") replaced invalidate_inode_pages2() by truncate_pagecache()
> in such a case to avoid the A-A deadlock. However, we found another A-B-B-A
> deadlock related to the case above, which will cause the xfstests
> generic/464 testcase hung in our virtio-fs test environment.
> 
> For example, consider two processes concurrently open one same file, one
> with O_TRUNC and another without O_TRUNC. The deadlock case is described
> below, if open(O_TRUNC) is already set_nowrite(acquired A), and is trying
> to lock a page (acquiring B), open() could have held the page lock
> (acquired B), and waiting on the page writeback (acquiring A). This would
> lead to deadlocks.
> 
> open(O_TRUNC)
> ----------------------------------------------------------------
> fuse_open_common
>   inode_lock            [C acquire]
>   fuse_set_nowrite      [A acquire]
> 
>   fuse_finish_open
>     truncate_pagecache
>       lock_page         [B acquire]
>       truncate_inode_page
>       unlock_page       [B release]
> 
>   fuse_release_nowrite  [A release]
>   inode_unlock          [C release]
> ----------------------------------------------------------------
> 
> open()
> ----------------------------------------------------------------
> fuse_open_common
>   fuse_finish_open
>     invalidate_inode_pages2
>       lock_page         [B acquire]
> 	fuse_launder_page
> 	  fuse_wait_on_page_writeback [A acquire & release]
>       unlock_page       [B release]
> ----------------------------------------------------------------
> 
> Besides this case, all calls of invalidate_inode_pages2() and
> invalidate_inode_pages2_range() in fuse code also can deadlock with
> open(O_TRUNC). This commit tries to fix it by adding a new lock,
> atomic_o_trunc, to protect the areas with the A-B-B-A deadlock risk.

Thanks.  Can you please try the following patch?  Instead of introducing a new
lock it tries to fix this by moving the truncate_pagecache() out of the nowrite
protected section.

Thanks,
Miklos

---
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 656e921f3506..56f439719129 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -537,6 +537,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	struct fuse_file *ff;
 	void *security_ctx = NULL;
 	u32 security_ctxlen;
+	bool trunc = flags & O_TRUNC;
 
 	/* Userspace expects S_IFREG in create mode */
 	BUG_ON((mode & S_IFMT) != S_IFREG);
@@ -561,7 +562,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	inarg.mode = mode;
 	inarg.umask = current_umask();
 
-	if (fm->fc->handle_killpriv_v2 && (flags & O_TRUNC) &&
+	if (fm->fc->handle_killpriv_v2 && trunc &&
 	    !(flags & O_EXCL) && !capable(CAP_FSETID)) {
 		inarg.open_flags |= FUSE_OPEN_KILL_SUIDGID;
 	}
@@ -623,6 +624,8 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	} else {
 		file->private_data = ff;
 		fuse_finish_open(inode, file);
+		if (fm->fc->atomic_o_trunc && trunc)
+			truncate_pagecache(inode, 0);
 	}
 	return err;
 
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 829094451774..2e041708ef44 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -210,7 +210,6 @@ void fuse_finish_open(struct inode *inode, struct file *file)
 		fi->attr_version = atomic64_inc_return(&fc->attr_version);
 		i_size_write(inode, 0);
 		spin_unlock(&fi->lock);
-		truncate_pagecache(inode, 0);
 		file_update_time(file);
 		fuse_invalidate_attr_mask(inode, FUSE_STATX_MODSIZE);
 	} else if (!(ff->open_flags & FOPEN_KEEP_CACHE)) {
@@ -239,30 +238,32 @@ int fuse_open_common(struct inode *inode, struct file *file, bool isdir)
 	if (err)
 		return err;
 
-	if (is_wb_truncate || dax_truncate) {
+	if (is_wb_truncate || dax_truncate)
 		inode_lock(inode);
-		fuse_set_nowrite(inode);
-	}
 
 	if (dax_truncate) {
 		filemap_invalidate_lock(inode->i_mapping);
 		err = fuse_dax_break_layouts(inode, 0, 0);
 		if (err)
-			goto out;
+			goto out_inode_unlock;
 	}
 
+	if (is_wb_truncate || dax_truncate)
+		fuse_set_nowrite(inode);
+
 	err = fuse_do_open(fm, get_node_id(inode), file, isdir);
 	if (!err)
 		fuse_finish_open(inode, file);
 
-out:
+	if (is_wb_truncate | dax_truncate)
+		fuse_release_nowrite(inode);
+	if (fc->atomic_o_trunc && (file->f_flags & O_TRUNC))
+		truncate_pagecache(inode, 0);
 	if (dax_truncate)
 		filemap_invalidate_unlock(inode->i_mapping);
-
-	if (is_wb_truncate | dax_truncate) {
-		fuse_release_nowrite(inode);
+out_inode_unlock:
+	if (is_wb_truncate | dax_truncate)
 		inode_unlock(inode);
-	}
 
 	return err;
 }



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux