Re: i_mutex locking in generic_file_splice_write()

On Thu, Oct 12, 2006 at 12:54:09PM -0700, Andrew Morton wrote:
> > Shouldn't we be taking this before calling into ->prepare_write() and
> > ->commit_write(). What's preventing generic_file_splice_write() from racing
> > a truncate? Or maybe even another write?
> 
> The lock_page() will block truncate and will block write()s to this particular
> page.
Ok.


> > A quick look through other callers reveals that generic_file_aio_write() and
> > do_lo_send_aops() both are careful to take i_mutex.
> 
> I'm trying to remember what i_mutex actually protects in this context. 
> i_size, certainly - if we go changing the file size without locks then
> other places might get surprised.  For example, a concurrent write() at a
> larger file offset might try to increase i_size but if it loses the race
> against the unlocked i_size-changing thread, the inode ends up with the
> smaller i_size.
I'm also worried about concurrent allocation tree changes. Perhaps I'm
mistaken and all the file systems we care about can handle them happening
concurrently, but otherwise couldn't two processes writing to different
sparse regions of a file cause problems, one via a regular file write and
the other via a splice write?
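
As a userspace illustration of the i_size concern quoted above (not kernel
code; `sized_inode` and `extend_size` are made-up names, and a pthread mutex
stands in for i_mutex), the check and the update of the size have to happen
under one lock so that a writer with a smaller end offset can never overwrite
a larger size:

```c
#include <pthread.h>

/* Hypothetical stand-in for an inode: just a lock and a size. */
struct sized_inode {
	pthread_mutex_t i_mutex;
	long long i_size;
};

/*
 * Grow i_size to cover a write that ends at 'end'. The compare and the
 * store are done under the same lock, so the size never moves backwards
 * regardless of which writer gets here last.
 */
static void extend_size(struct sized_inode *inode, long long end)
{
	pthread_mutex_lock(&inode->i_mutex);
	if (end > inode->i_size)
		inode->i_size = end;
	pthread_mutex_unlock(&inode->i_mutex);
}
```

Without the lock, two writers could both read the old size and then store
their own end offsets in either order, which is the "ends up with the smaller
i_size" outcome described in the quote.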


> So yup, we need i_mutex if only for that reason.
Ok. Here's a first pass. The double lock is ugly, but as far as I can tell
we need it. Unless there's a rule about ordering between pipe inodes and
"other" inodes that I don't know about.
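
For readers unfamiliar with the trick, here is a rough userspace sketch of
address-ordered locking, which is the idea the patch below relies on. It is
only an illustration under stated assumptions: `fake_inode`, `lock_two` and
`unlock_two` are invented names, and pthread mutexes stand in for i_mutex.
Nothing here is the kernel's API.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for struct inode; only the lock matters here. */
struct fake_inode {
	pthread_mutex_t i_mutex;
};

/*
 * Take both locks in ascending address order. Every caller then agrees
 * on the order no matter which way round it names the pair, so two
 * threads cannot each hold one lock while waiting for the other (ABBA).
 */
static void lock_two(struct fake_inode *a, struct fake_inode *b)
{
	if (a == b) {
		/* Same object passed twice: lock it only once. */
		pthread_mutex_lock(&a->i_mutex);
		return;
	}
	if ((uintptr_t)a > (uintptr_t)b) {
		struct fake_inode *tmp = a;
		a = b;
		b = tmp;
	}
	pthread_mutex_lock(&a->i_mutex);
	pthread_mutex_lock(&b->i_mutex);
}

/* Unlock order does not matter for deadlock avoidance. */
static void unlock_two(struct fake_inode *a, struct fake_inode *b)
{
	pthread_mutex_unlock(&a->i_mutex);
	if (a != b)
		pthread_mutex_unlock(&b->i_mutex);
}
```

Calling `lock_two(&x, &y)` and `lock_two(&y, &x)` acquires the two mutexes in
the same order, which is why the comparison on addresses is enough to rule
out the deadlock.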

Compile tested only. I probably won't get a chance to actually run it until
late this weekend at the earliest :/
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh@xxxxxxxxxx


From: Mark Fasheh <mark.fasheh@xxxxxxxxxx>

[PATCH] Take i_mutex in splice_from_pipe()

The splice_actor may be calling ->prepare_write() and ->commit_write(). We
want i_mutex on the inode being written to before calling those so that we
don't race i_size changes.

Signed-off-by: Mark Fasheh <mark.fasheh@xxxxxxxxxx>

---

 fs/splice.c |   21 +++++++++++++++++++--
 1 files changed, 19 insertions(+), 2 deletions(-)

96e880f22bd1c0e2809ebbfe5bf122ed67019e33
diff --git a/fs/splice.c b/fs/splice.c
index 13e92dd..e1ecb9e 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -713,6 +713,7 @@ ssize_t splice_from_pipe(struct pipe_ino
 {
 	int ret, do_wakeup, err;
 	struct splice_desc sd;
+	struct inode *inode = out->f_mapping->host;
 
 	ret = 0;
 	do_wakeup = 0;
@@ -722,8 +723,23 @@ ssize_t splice_from_pipe(struct pipe_ino
 	sd.file = out;
 	sd.pos = *ppos;
 
-	if (pipe->inode)
-		mutex_lock(&pipe->inode->i_mutex);
+	/*
+	 * The actor worker might be calling ->prepare_write and
+	 * ->commit_write. Most of the time, these expect i_mutex to
+	 * be held. Since this may result in an ABBA deadlock with
+	 * pipe->inode, we have to order lock acquisition here.
+	 */
+	if (pipe->inode) {
+		if (pipe->inode < inode) {
+			mutex_lock_nested(&pipe->inode->i_mutex, I_MUTEX_PARENT);
+			mutex_lock_nested(&inode->i_mutex, I_MUTEX_CHILD);
+		} else {
+			mutex_lock_nested(&inode->i_mutex, I_MUTEX_PARENT);
+			mutex_lock_nested(&pipe->inode->i_mutex, I_MUTEX_CHILD);
+		}
+	} else {
+		mutex_lock(&inode->i_mutex);
+	}
 
 	for (;;) {
 		if (pipe->nrbufs) {
@@ -799,6 +815,7 @@ ssize_t splice_from_pipe(struct pipe_ino
 
 	if (pipe->inode)
 		mutex_unlock(&pipe->inode->i_mutex);
+	mutex_unlock(&inode->i_mutex);
 
 	if (do_wakeup) {
 		smp_mb();
-- 
1.3.3
