Hi Al, I have been looking at __generic_file_write_iter() and think I may have found a bug when O_DIRECT is enabled and the write falls through to generic_perform_write() after the generic_file_direct_write() and then the filemap_write_and_wait_range() fails for some reason. The relevant code is: status = generic_perform_write(file, from, pos); [...] iocb->ki_pos = pos + status; [...] err = filemap_write_and_wait_range(file->f_mapping, pos, endbyte); if (err == 0) { written += status; Now consider the case where filemap_write_and_wait_range() returned an error. In that case we return with iocb->ki_pos already advanced to "pos + status" when in fact (some of) the write to disk actually failed. We then end up returning "written" which does not include "status" in its value but we set file->f_pos to iocb->ki_pos in the write() system call (via file_pos_write(f.file, pos); where "pos" was set to iocb->ki_pos). Now imagine a userspace application doing a simple write loop: do { written = write(f, buf, bytes); buf += written; [...] } while (...); In the above example we will advance "buf" by "written" bytes and then write "buf + written" at the incorrect file offset as the previous write() system call advanced file->f_pos by "written + status" instead of just "written". If this happens during a FTP upload for example you end up with a destination file which is larger than the source file and the data is shifted/duplicated wherever this error occurs by a shift of "status" bytes or a duplication of "status" bytes. Assuming you agree with what I am saying above I would propose that the iocb->ki_pos be moved down to the written += status, i.e. this: Signed-off-by: Anton Altaparmakov <aia21@xxxxxxxxxx> diff --git a/mm/filemap.c b/mm/filemap.c index 90effcd..65e5df4 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2606,7 +2606,6 @@ ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from) err = status; goto out; } - iocb->ki_pos = pos + status; /* * We need to ensure that the page cache pages are written to * disk and invalidated to preserve the expected O_DIRECT @@ -2615,6 +2614,7 @@ ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from) endbyte = pos + status - 1; err = filemap_write_and_wait_range(file->f_mapping, pos, endbyte); if (err == 0) { + iocb->ki_pos = pos + status; written += status; invalidate_mapping_pages(mapping, pos >> PAGE_CACHE_SHIFT, Best regards, Anton -- Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @) University of Cambridge Information Services, Roger Needham Building 7 JJ Thomson Avenue, Cambridge, CB3 0RB, UK -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html