Re: Bug in __generic_file_write_iter()?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue 26-08-14 14:22:24, Anton Altaparmakov wrote:
> Hi Al,
> 
> I have been looking at __generic_file_write_iter() and think I may have found a bug when O_DIRECT is enabled and the write falls through to generic_perform_write() after the generic_file_direct_write() and then the filemap_write_and_wait_range() fails for some reason.
> 
> The relevant code is:
> 
> 	status = generic_perform_write(file, from, pos);
> 	[...]
> 	iocb->ki_pos = pos + status;
> 	[...]
> 	err = filemap_write_and_wait_range(file->f_mapping, pos, endbyte);
> 	if (err == 0) {
> 		written += status;
> 
> Now consider the case where filemap_write_and_wait_range() returned an error.
> 
> In that case we return with iocb->ki_pos already advanced to "pos + status" when in fact (some of) the write to disk actually failed.
> 
> We then end up returning "written" which does not include "status" in its value but we set file->f_pos to iocb->ki_pos in the write() system call (via file_pos_write(f.file, pos); where "pos" was set to iocb->ki_pos).
> 
> Now imagine a userspace application doing a simple write loop:
> 
> 	do {
> 		written = write(f, buf, bytes);
> 		buf += written;
> 		[...]
> 	} while (...);
> 
> In the above example we will advance "buf" by "written" bytes and then write "buf + written" at the incorrect file offset as the previous write() system call advanced file->f_pos by "written + status" instead of just "written".
> 
> If this happens during a FTP upload for example you end up with a destination file which is larger than the source file and the data is shifted/duplicated wherever this error occurs by a shift of "status" bytes or a duplication of "status" bytes.
> 
> Assuming you agree with what I am saying above I would propose that the iocb->ki_pos be moved down to the written += status, i.e. this:
> 
> Signed-off-by: Anton Altaparmakov <aia21@xxxxxxxxxx>
  Looks good to me. You can add:
Reviewed-by: Jan Kara <jack@xxxxxxx>

								Honza

> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 90effcd..65e5df4 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2606,7 +2606,6 @@ ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  			err = status;
>  			goto out;
>  		}
> -		iocb->ki_pos = pos + status;
>  		/*
>  		 * We need to ensure that the page cache pages are written to
>  		 * disk and invalidated to preserve the expected O_DIRECT
> @@ -2615,6 +2614,7 @@ ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  		endbyte = pos + status - 1;
>  		err = filemap_write_and_wait_range(file->f_mapping, pos, endbyte);
>  		if (err == 0) {
> +			iocb->ki_pos = pos + status;
>  			written += status;
>  			invalidate_mapping_pages(mapping,
>  						 pos >> PAGE_CACHE_SHIFT,
> 
> 
> Best regards,
> 
> 	Anton
> -- 
> Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
> University of Cambridge Information Services, Roger Needham Building
> 7 JJ Thomson Avenue, Cambridge, CB3 0RB, UK
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux