Bug in __generic_file_write_iter()?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Al,

I have been looking at __generic_file_write_iter() and think I may have found a bug when O_DIRECT is enabled and the write falls through to generic_perform_write() after the generic_file_direct_write() and then the filemap_write_and_wait_range() fails for some reason.

The relevant code is:

	status = generic_perform_write(file, from, pos);
	[...]
	iocb->ki_pos = pos + status;
	[...]
	err = filemap_write_and_wait_range(file->f_mapping, pos, endbyte);
	if (err == 0) {
		written += status;

Now consider the case where filemap_write_and_wait_range() returned an error.

In that case we return with iocb->ki_pos already advanced to "pos + status" when in fact (some of) the write to disk actually failed.

We then end up returning "written" which does not include "status" in its value but we set file->f_pos to iocb->ki_pos in the write() system call (via file_pos_write(f.file, pos); where "pos" was set to iocb->ki_pos).

Now imagine a userspace application doing a simple write loop:

	do {
		written = write(f, buf, bytes);
		buf += written;
		[...]
	} while (...);

In the above example we will advance "buf" by "written" bytes and then write "buf + written" at the incorrect file offset as the previous write() system call advanced file->f_pos by "written + status" instead of just "written".

If this happens during a FTP upload for example you end up with a destination file which is larger than the source file and the data is shifted/duplicated wherever this error occurs by a shift of "status" bytes or a duplication of "status" bytes.

Assuming you agree with what I am saying above I would propose that the iocb->ki_pos be moved down to the written += status, i.e. this:

Signed-off-by: Anton Altaparmakov <aia21@xxxxxxxxxx>

diff --git a/mm/filemap.c b/mm/filemap.c
index 90effcd..65e5df4 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2606,7 +2606,6 @@ ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 			err = status;
 			goto out;
 		}
-		iocb->ki_pos = pos + status;
 		/*
 		 * We need to ensure that the page cache pages are written to
 		 * disk and invalidated to preserve the expected O_DIRECT
@@ -2615,6 +2614,7 @@ ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
 		endbyte = pos + status - 1;
 		err = filemap_write_and_wait_range(file->f_mapping, pos, endbyte);
 		if (err == 0) {
+			iocb->ki_pos = pos + status;
 			written += status;
 			invalidate_mapping_pages(mapping,
 						 pos >> PAGE_CACHE_SHIFT,


Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
University of Cambridge Information Services, Roger Needham Building
7 JJ Thomson Avenue, Cambridge, CB3 0RB, UK

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux