Patch "fuse: allow non-extending parallel direct writes on the same file" has been added to the 6.1-stable tree

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



This is a note to let you know that I've just added the patch titled

    fuse: allow non-extending parallel direct writes on the same file

to the 6.1-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     fuse-allow-non-extending-parallel-direct-writes-on-t.patch
and it can be found in the queue-6.1 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 7d1ee672f2616d775939b066f475cd5168f32362
Author: Dharmendra Singh <dsingh@xxxxxxx>
Date:   Fri Jun 17 12:40:27 2022 +0530

    fuse: allow non-extending parallel direct writes on the same file
    
    [ Upstream commit 153524053bbb0d27bb2e0be36d1b46862e9ce74c ]
    
    In general, as of now, in FUSE, direct writes on the same file are
    serialized over inode lock i.e we hold inode lock for the full duration of
    the write request.  I could not find in fuse code and git history a comment
    which clearly explains why this exclusive lock is taken for direct writes.
    Following might be the reasons for acquiring an exclusive lock but not be
    limited to
    
     1) Our guess is some USER space fuse implementations might be relying on
        this lock for serialization.
    
     2) The lock protects against file read/write size races.
    
     3) Ruling out any issues arising from partial write failures.
    
    This patch relaxes the exclusive lock for direct non-extending writes only.
    File size extending writes might not need the lock either, but we are not
    entirely sure if there is a risk to introduce any kind of regression.
    Furthermore, benchmarking with fio does not show a difference between patch
    versions that take on file size extension a) an exclusive lock and b) a
    shared lock.
    
    A possible example of an issue with i_size extending writes are write error
    cases.  Some writes might succeed and others might fail for file system
    internal reasons - for example ENOSPACE.  With parallel file size extending
    writes it _might_ be difficult to revert the action of the failing write,
    especially to restore the right i_size.
    
    With these changes, we allow non-extending parallel direct writes on the
    same file with the help of a flag called FOPEN_PARALLEL_DIRECT_WRITES.  If
    this flag is set on the file (flag is passed from libfuse to fuse kernel as
    part of file open/create), we do not take exclusive lock anymore, but
    instead use a shared lock that allows non-extending writes to run in
    parallel.  FUSE implementations which rely on this inode lock for
    serialization can continue to do so and serialized direct writes are still
    the default.  Implementations that do not do write serialization need to be
    updated and need to set the FOPEN_PARALLEL_DIRECT_WRITES flag in their file
    open/create reply.
    
    On patch review there were concerns that network file systems (or vfs
    multiple mounts of the same file system) might have issues with parallel
    writes.  We believe this is not the case, as this is just a local lock,
    which network file systems could not rely on anyway.  I.e. this lock is
    just for local consistency.
    
    Signed-off-by: Dharmendra Singh <dsingh@xxxxxxx>
    Signed-off-by: Bernd Schubert <bschubert@xxxxxxx>
    Signed-off-by: Miklos Szeredi <mszeredi@xxxxxxxxxx>
    Stable-dep-of: 3002240d1649 ("fuse: fix memory leak in fuse_create_open")
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index e6ec4338a9c5..0df1311afb87 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1563,14 +1563,47 @@ static ssize_t fuse_direct_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	return res;
 }
 
+static bool fuse_direct_write_extending_i_size(struct kiocb *iocb,
+					       struct iov_iter *iter)
+{
+	struct inode *inode = file_inode(iocb->ki_filp);
+
+	return iocb->ki_pos + iov_iter_count(iter) > i_size_read(inode);
+}
+
 static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from)
 {
 	struct inode *inode = file_inode(iocb->ki_filp);
+	struct file *file = iocb->ki_filp;
+	struct fuse_file *ff = file->private_data;
 	struct fuse_io_priv io = FUSE_IO_PRIV_SYNC(iocb);
 	ssize_t res;
+	bool exclusive_lock =
+		!(ff->open_flags & FOPEN_PARALLEL_DIRECT_WRITES) ||
+		iocb->ki_flags & IOCB_APPEND ||
+		fuse_direct_write_extending_i_size(iocb, from);
+
+	/*
+	 * Take exclusive lock if
+	 * - Parallel direct writes are disabled - a user space decision
+	 * - Parallel direct writes are enabled and i_size is being extended.
+	 *   This might not be needed at all, but needs further investigation.
+	 */
+	if (exclusive_lock)
+		inode_lock(inode);
+	else {
+		inode_lock_shared(inode);
+
+		/* A race with truncate might have come up as the decision for
+		 * the lock type was done without holding the lock, check again.
+		 */
+		if (fuse_direct_write_extending_i_size(iocb, from)) {
+			inode_unlock_shared(inode);
+			inode_lock(inode);
+			exclusive_lock = true;
+		}
+	}
 
-	/* Don't allow parallel writes to the same file */
-	inode_lock(inode);
 	res = generic_write_checks(iocb, from);
 	if (res > 0) {
 		if (!is_sync_kiocb(iocb) && iocb->ki_flags & IOCB_DIRECT) {
@@ -1581,7 +1614,10 @@ static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from)
 			fuse_write_update_attr(inode, iocb->ki_pos, res);
 		}
 	}
-	inode_unlock(inode);
+	if (exclusive_lock)
+		inode_unlock(inode);
+	else
+		inode_unlock_shared(inode);
 
 	return res;
 }
@@ -2937,6 +2973,7 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
 
 	if (iov_iter_rw(iter) == WRITE) {
 		fuse_write_update_attr(inode, pos, ret);
+		/* For extending writes we already hold exclusive lock */
 		if (ret < 0 && offset + count > i_size)
 			fuse_do_truncate(file);
 	}
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 39cfb343faa8..e3c54109bae9 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -200,6 +200,7 @@
  *
  *  7.38
  *  - add FUSE_EXPIRE_ONLY flag to fuse_notify_inval_entry
+ *  - add FOPEN_PARALLEL_DIRECT_WRITES
  */
 
 #ifndef _LINUX_FUSE_H
@@ -307,6 +308,7 @@ struct fuse_file_lock {
  * FOPEN_CACHE_DIR: allow caching this directory
  * FOPEN_STREAM: the file is stream-like (no file position at all)
  * FOPEN_NOFLUSH: don't flush data cache on close (unless FUSE_WRITEBACK_CACHE)
+ * FOPEN_PARALLEL_DIRECT_WRITES: Allow concurrent direct writes on the same inode
  */
 #define FOPEN_DIRECT_IO		(1 << 0)
 #define FOPEN_KEEP_CACHE	(1 << 1)
@@ -314,6 +316,7 @@ struct fuse_file_lock {
 #define FOPEN_CACHE_DIR		(1 << 3)
 #define FOPEN_STREAM		(1 << 4)
 #define FOPEN_NOFLUSH		(1 << 5)
+#define FOPEN_PARALLEL_DIRECT_WRITES	(1 << 6)
 
 /**
  * INIT request/reply flags




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux