Re: CIFS data coherency problem

On Sun, 12 Sep 2010 23:14:35 +0400
Pavel Shilovsky <piastryyy@xxxxxxxxx> wrote:

> 2010/9/12 Jeff Layton <jlayton@xxxxxxxxx>:
> > write_begin/write_end are called on each page in a write syscall. So if
> > your application is writing in 64k page-aligned chunks, write_end will
> > be called 16 times. When you have no oplock with this patch, for each
> > call to write_end, you're calling cifs_write which will flush each
> > single page synchronously and only that single page to the server. Your
> > 64k write will take 16 round trips to the server to complete.
> >
> > What you probably want to do instead is populate the pagecache with the
> > write contents (as is done today), flush the write and wait on the
> > result. Optionally, you could then invalidate the cache to free up the
> > pagecache pages (though you'll need to take care not to race with other
> > writers).
> 
> Ok, I understand you. What do you think about the following idea?
> 
> 1) if we don't have an exclusive oplock, we write the data to the
> server before do_sync_write but in cifs_write_end we don't mark the
> page as dirty and don't write to the server;
>
> 2) if we have an exclusive oplock, we don't write the data before
> do_sync_write and mark page as dirty in cifs_write_end (again we don't
> write anything to the server in cifs_write_end);
> 

Sorry, I'm not quite following what you mean here. I don't think you
need to mess with cifs_write_end at all.

What I think you basically want to do is replace the filemap_fdatawrite
clause in cifs_file_aio_write with a filemap_write_and_wait. If you
want to get a little more fancy, you could just have it flush the range
of the file actually being written.
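
To make the "flush only the range being written" variant concrete, here is a small userspace sketch of the range arithmetic. It is not kernel code. As far as I know, filemap_write_and_wait_range() takes a start offset and an inclusive end offset in bytes, so the end of the range would be pos + written - 1. The struct, the helper name and the 4 KiB page size are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical helper type: the byte and page range touched by one write. */
struct flush_range {
    int64_t start;       /* first byte that was written */
    int64_t end;         /* last byte that was written (inclusive) */
    uint64_t first_page; /* index of the first page touched */
    uint64_t last_page;  /* index of the last page touched */
    int valid;           /* 0 when the write wrote nothing */
};

/*
 * Compute the range that a write of `written` bytes at offset `pos`
 * touched. A range-based flush would be handed start/end from here.
 */
static struct flush_range written_range(int64_t pos, int64_t written)
{
    struct flush_range r = {0, 0, 0, 0, 0};

    if (pos < 0 || written <= 0)
        return r; /* nothing to flush */

    r.start = pos;
    r.end = pos + written - 1;
    r.first_page = (uint64_t)r.start >> MODEL_PAGE_SHIFT;
    r.last_page = (uint64_t)r.end >> MODEL_PAGE_SHIFT;
    r.valid = 1;
    return r;
}
```

With these assumptions, a 64k write at offset 0 covers bytes 0 through 65535, i.e. pages 0 through 15, and only those pages would need to be flushed and waited on.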

After that, you could also consider invalidating the pages in the
mapping (thereby allowing the VM to reclaim them) if you don't have an
oplock.

IOW, you want to write to the pagecache regardless of whether you have
an oplock or not. The oplock only decides whether you flush the data to
the server and wait on the result after writing to the pagecache.
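
A toy userspace model of that policy may help, again as a sketch and not kernel code. Every name below is hypothetical. The model assumes 4 KiB pages and assumes that a contiguous run of dirty pages goes out as a single request; a real client would be bounded by its negotiated write size. The invalidate step is the optional one mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u /* assumption: 4 KiB pages */
#define MODEL_NPAGES    64u   /* size of the toy file, in pages */

/* Hypothetical model of one file's pagecache state. */
struct model_file {
    bool cached[MODEL_NPAGES]; /* page is present in the cache */
    bool dirty[MODEL_NPAGES];  /* page has unflushed data */
    bool have_oplock;
    unsigned round_trips;      /* write requests sent to the "server" */
};

static void model_init(struct model_file *f, bool have_oplock)
{
    memset(f, 0, sizeof(*f));
    f->have_oplock = have_oplock;
}

/* Flush dirty pages in [first, last]; one request per contiguous dirty run. */
static void model_flush_and_wait(struct model_file *f, size_t first, size_t last)
{
    bool in_run = false;

    for (size_t i = first; i <= last && i < MODEL_NPAGES; i++) {
        if (f->dirty[i]) {
            if (!in_run) {
                f->round_trips++;
                in_run = true;
            }
            f->dirty[i] = false;
        } else {
            in_run = false;
        }
    }
}

/* Optional step: drop clean pages in [first, last] from the cache. */
static void model_invalidate(struct model_file *f, size_t first, size_t last)
{
    for (size_t i = first; i <= last && i < MODEL_NPAGES; i++)
        if (!f->dirty[i])
            f->cached[i] = false;
}

/* Returns the number of bytes written, or -1 if outside the toy file. */
static long model_write(struct model_file *f, size_t pos, size_t len)
{
    const size_t file_size = (size_t)MODEL_NPAGES * MODEL_PAGE_SIZE;

    if (len == 0)
        return 0;
    if (pos >= file_size || len > file_size - pos)
        return -1;

    size_t first = pos / MODEL_PAGE_SIZE;
    size_t last = (pos + len - 1) / MODEL_PAGE_SIZE;

    /* Always populate the pagecache, oplock or not. */
    for (size_t i = first; i <= last; i++) {
        f->cached[i] = true;
        f->dirty[i] = true;
    }

    /* The oplock only decides whether we flush and wait right away. */
    if (!f->have_oplock) {
        model_flush_and_wait(f, first, last);
        model_invalidate(f, first, last);
    }
    return (long)len;
}
```

In this model a 64k write without an oplock costs one request instead of one per page, and with an oplock the pages simply stay dirty in the cache for later writeback.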

-- 
Jeff Layton <jlayton@xxxxxxxxx>