2010/11/2 Jeff Layton <jlayton@xxxxxxxxxx>: > On Tue, 2 Nov 2010 12:02:24 +0300 > Pavel Shilovsky <piastryyy@xxxxxxxxx> wrote: > >> Modify cifs_file_aio_write and cifs_write_end to let the client works with >> strict cache mode. >> >> Signed-off-by: Pavel Shilovsky <piastryyy@xxxxxxxxx> >> --- >> fs/cifs/cifsfs.c | 35 ++++++++++++++++++++++++++++++----- >> fs/cifs/file.c | 14 ++++++++++++-- >> 2 files changed, 42 insertions(+), 7 deletions(-) >> >> diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c >> index 1b44a92..e1ecd35 100644 >> --- a/fs/cifs/cifsfs.c >> +++ b/fs/cifs/cifsfs.c >> @@ -602,12 +602,37 @@ static ssize_t cifs_file_aio_read(struct kiocb >> *iocb, const struct iovec *iov, >> static ssize_t cifs_file_aio_write(struct kiocb *iocb, const struct iovec *iov, >> unsigned long nr_segs, loff_t pos) >> { >> - struct inode *inode = iocb->ki_filp->f_path.dentry->d_inode; >> - ssize_t written; >> + struct inode *inode; >> + struct cifs_sb_info *cifs_sb; >> + ssize_t written, cache_written; >> + loff_t saved_pos; >> + >> + inode = iocb->ki_filp->f_path.dentry->d_inode; >> + >> + if (CIFS_I(inode)->clientCanCacheAll) >> + return generic_file_aio_write(iocb, iov, nr_segs, pos); >> + >> + cifs_sb = CIFS_SB(iocb->ki_filp->f_path.dentry->d_sb); >> + >> + if ((cifs_sb->mnt_cifs_flags & CIFS_MOUNT_STRICT_IO) == 0) { >> + written = generic_file_aio_write(iocb, iov, nr_segs, pos); >> + filemap_write_and_wait(inode->i_mapping); > ^^^^^^^^^^^^^^^^^^ > You can't ignore the return code from this. That function > may return an error if writeback fails. Also, I don't > see any need to wait on the result in this case. Why > not just kick off the I/O and return (do a > filemap_fdatawrite, IOW). I don't change non-strict variant for writing - we have this code now in the git tree. But I agree - we should think about return value in this case. > >> + return written; >> + } >> + >> + saved_pos = pos; >> + written = cifs_user_write(iocb->ki_filp, iov->iov_base, >> + iov->iov_len, &pos); >> + >> + if (written > 0) { >> + cache_written = generic_file_aio_write(iocb, iov, >> + nr_segs, saved_pos); >> + if (cache_written != written) >> + cERROR(1, "Cache written and server written data " >> + "lengths are different"); >> + } else >> + iocb->ki_pos = pos; >> > ^^^^^ > This seems awfully complicated. Why not just do a > generic_file_aio_write to get this into the cache and then just do a > filemap_write_and_wait and deal with the result? The main reason of doing this is mandatory byte-range locks. If we simply do generic_file_aio_write and then filemap_write_and_wait we can fail in the following situation: 1) process1 opens file and sets a mandatory lock from 0 to 1. 2) process2 opens file and writes a data from 1 to 2. If we do like you suggest we fail on page writing (on filemap_write_and_wait which writes whole page - from 0 to 2). That's why I do cifs_user_write which writes the data from 1 to 2 (it's what we need) and then store it in the cache by generic_file_aio_write (with the little change in cifs_write_end that doesn't write the same data twice to the server). So, right working of write and read ops with mandatory locks is one of the reasons to provide srict cache semantic. > >> - written = generic_file_aio_write(iocb, iov, nr_segs, pos); >> - if (!CIFS_I(inode)->clientCanCacheAll) >> - filemap_fdatawrite(inode->i_mapping); >> return written; >> } >> >> diff --git a/fs/cifs/file.c b/fs/cifs/file.c >> index 3970e68..4796748 100644 >> --- a/fs/cifs/file.c >> +++ b/fs/cifs/file.c >> @@ -1539,7 +1539,11 @@ static int cifs_write_end(struct file *file, >> struct address_space *mapping, >> struct page *page, void *fsdata) >> { >> int rc; >> - struct inode *inode = mapping->host; >> + struct inode *inode; >> + struct cifs_sb_info *cifs_sb; >> + >> + inode = mapping->host; >> + cifs_sb = CIFS_SB(inode->i_sb); >> >> cFYI(1, "write_end for page %p from pos %lld with %d bytes", >> page, pos, copied); >> @@ -1572,7 +1576,13 @@ static int cifs_write_end(struct file *file, >> struct address_space *mapping, >> } else { >> rc = copied; >> pos += copied; >> - set_page_dirty(page); >> + /* if we have strict cache switched on and don't have Exclusive >> + oplock for the inode, we don't have to set_page_dirty >> + because we flushed the data to the server in >> + cifs_file_aio_write before */ >> + if ((cifs_sb->mnt_cifs_flags & CIFS_MOUNT_STRICT_IO) == 0 || >> + CIFS_I(inode)->clientCanCacheAll) >> + set_page_dirty(page); >> } >> >> if (rc > 0) { > > > -- > Jeff Layton <jlayton@xxxxxxxxxx> > -- Best regards, Pavel Shilovsky. -- To unsubscribe from this list: send the line "unsubscribe linux-cifs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html