Re: [PATCH 4/5] CIFS: New write logic

On Wed, 3 Nov 2010 07:14:56 -0400
Jeff Layton <jlayton@xxxxxxxxxx> wrote:

> On Tue, 2 Nov 2010 20:59:00 +0300
> Pavel Shilovsky <piastryyy@xxxxxxxxx> wrote:
> 
> > 2010/11/2 Jeff Layton <jlayton@xxxxxxxxxx>:
> > > On Tue, 2 Nov 2010 12:02:24 +0300
> > > Pavel Shilovsky <piastryyy@xxxxxxxxx> wrote:
> > >
> > >> Modify cifs_file_aio_write and cifs_write_end to let the client work in
> > >> strict cache mode.
> > >>
> > >> Signed-off-by: Pavel Shilovsky <piastryyy@xxxxxxxxx>
> > >> ---
> > >>  fs/cifs/cifsfs.c |   35 ++++++++++++++++++++++++++++++-----
> > >>  fs/cifs/file.c   |   14 ++++++++++++--
> > >>  2 files changed, 42 insertions(+), 7 deletions(-)
> > >>
> > >> diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
> > >> index 1b44a92..e1ecd35 100644
> > >> --- a/fs/cifs/cifsfs.c
> > >> +++ b/fs/cifs/cifsfs.c
> > >> @@ -602,12 +602,37 @@ static ssize_t cifs_file_aio_read(struct kiocb *iocb, const struct iovec *iov,
> > >>  static ssize_t cifs_file_aio_write(struct kiocb *iocb, const struct iovec *iov,
> > >>                                  unsigned long nr_segs, loff_t pos)
> > >>  {
> > >> -     struct inode *inode = iocb->ki_filp->f_path.dentry->d_inode;
> > >> -     ssize_t written;
> > >> +     struct inode *inode;
> > >> +     struct cifs_sb_info *cifs_sb;
> > >> +     ssize_t written, cache_written;
> > >> +     loff_t saved_pos;
> > >> +
> > >> +     inode = iocb->ki_filp->f_path.dentry->d_inode;
> > >> +
> > >> +     if (CIFS_I(inode)->clientCanCacheAll)
> > >> +             return generic_file_aio_write(iocb, iov, nr_segs, pos);
> > >> +
> > >> +     cifs_sb = CIFS_SB(iocb->ki_filp->f_path.dentry->d_sb);
> > >> +
> > >> +     if ((cifs_sb->mnt_cifs_flags & CIFS_MOUNT_STRICT_IO) == 0) {
> > >> +             written = generic_file_aio_write(iocb, iov, nr_segs, pos);
> > >> +             filemap_write_and_wait(inode->i_mapping);
> > >                ^^^^^^^^^^^^^^^^^^
> > >                You can't ignore the return code from this. That function
> > >                may return an error if writeback fails. Also, I don't
> > >                see any need to wait on the result in this case. Why
> > >                not just kick off the I/O and return (do a
> > >                filemap_fdatawrite, IOW).
> > 
> > I didn't change the non-strict write path; that code is already in
> > the git tree. But I agree that we should handle the return value in
> > this case.
> > 
> > >
> > >> +             return written;
> > >> +     }
> > >> +
> > >> +     saved_pos = pos;
> > >> +     written = cifs_user_write(iocb->ki_filp, iov->iov_base,
> > >> +                               iov->iov_len, &pos);
> > >> +
> > >> +     if (written > 0) {
> > >> +             cache_written = generic_file_aio_write(iocb, iov,
> > >> +                                                    nr_segs, saved_pos);
> > >> +             if (cache_written != written)
> > >> +                     cERROR(1, "Cache written and server written data "
> > >> +                            "lengths are different");
> > >> +     } else
> > >> +             iocb->ki_pos = pos;
> > >>
> > >        ^^^^^
> > > This seems awfully complicated. Why not just do a
> > > generic_file_aio_write to get this into the cache and then just do a
> > > filemap_write_and_wait and deal with the result?
> > 
> > The main reason for doing this is mandatory byte-range locks. If we
> > simply call generic_file_aio_write and then filemap_write_and_wait, we
> > can fail in the following situation:
> > 1) process1 opens the file and sets a mandatory lock from 0 to 1.
> > 2) process2 opens the file and writes data from 1 to 2.
> > 
> > If we do as you suggest, we fail on page writeback (in
> > filemap_write_and_wait, which writes the whole page, from 0 to 2).
> > That's why I call cifs_user_write, which writes only the data from 1
> > to 2 (which is what we need), and then store it in the cache with
> > generic_file_aio_write (with a small change in cifs_write_end so that
> > the same data is not written to the server twice). So making write
> > and read ops work correctly with mandatory locks is one of the
> > reasons to provide strict cache semantics.
> > 
> 
> Why store it in the cache at all at that point then? Is that for mmap?
> 
> This seems really ugly and doesn't pass the "sniff test" -- something
> smells foul here. There must be a better way to handle this sort of
> thing...
> 

To elaborate...

Rather than calling down into generic_file_aio_write, I think you'd be
better served by simply invalidating the pages in the range that the
write touched, or possibly just invalidating the entire cached inode.
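
As a rough illustration of the range-based option, here is a minimal
userspace sketch of the arithmetic involved: mapping the byte range of a
write onto the page indexes that would need invalidating. The struct, the
helper name and the 4 KiB page size are assumptions made for this example
only; they are not part of the patch. In the kernel, a range like this
could be handed to something like invalidate_inode_pages2_range(), but
that call is only mentioned in a comment here, not exercised.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical helper type: inclusive range of page indexes. */
struct page_range {
	uint64_t first;	/* index of the first page the write touched */
	uint64_t last;	/* index of the last page the write touched */
};

/*
 * Map a write of 'len' bytes at offset 'pos' onto the inclusive range
 * of page indexes it touches. In kernel code, the result might be
 * passed as the start/end arguments of invalidate_inode_pages2_range().
 */
static struct page_range write_to_page_range(uint64_t pos, uint64_t len)
{
	struct page_range r;

	assert(len > 0);	/* a zero-length write touches no pages */
	r.first = pos >> SKETCH_PAGE_SHIFT;
	r.last = (pos + len - 1) >> SKETCH_PAGE_SHIFT;
	return r;
}
```

A write that stays inside one page yields first == last, so only that
page would be dropped; a write that straddles a page boundary yields
two or more pages.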

Also, I still haven't seen a description of what the semantics for mmap
will be in this case. If I'm using strict caching and mmap a file, it
obviously isn't going to read/write through every time userspace
touches the memory. What can I expect to happen when I read or write to
that mmap? How can I ensure that new data will be faulted in or data
that I write will be synced out?

This needs to be settled before we can consider merging this code.

-- 
Jeff Layton <jlayton@xxxxxxxxxx>