On Fri, 22 Apr 2011 11:28:28 -0500 Steve French <smfrench@xxxxxxxxx> wrote: > On Fri, Apr 22, 2011 at 10:07 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: > > On Fri, 22 Apr 2011 09:54:10 -0500 > > Steve French <smfrench@xxxxxxxxx> wrote: > > > >> On Fri, Apr 22, 2011 at 9:10 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: > >> > On Fri, 22 Apr 2011 09:02:18 -0500 > >> > Steve French <smfrench@xxxxxxxxx> wrote: > >> > > >> >> On Fri, Apr 22, 2011 at 6:50 AM, Jeff Layton <jlayton@xxxxxxxxxx> wrote: > >> >> > On Fri, 22 Apr 2011 12:09:20 +0400 > >> >> > Pavel Shilovsky <piastry@xxxxxxxxxxx> wrote: > >> >> > > >> >> >> After conversation with Steve, we decided to drop > >> >> >> filemap_write_and_wait from getattr, because we already do it in > >> >> >> cifs_file_aio_write. I also think that we should drop it from > >> >> >> cifs_llseek in this case too. I will repost this patch later (launder > >> >> >> page operation patch was merged earlier). > >> >> >> > >> >> > > >> >> > I wasn't privy to this discussion, but that makes no sense to me. Just > >> >> > because we initiated writeout in cifs_file_aio_write, does not mean > >> >> > that it's complete. If it's not complete then the size returned by the > >> >> > server may be bogus. > >> >> > >> >> What would a local file system do in the case when a write is > >> >> racing with a getattr? ÂIn the case of cifs, when we issue > >> >> a write, and don't have oplock, we immediately send the > >> >> write on the network - but AFAIK posix provides no guarantees > >> >> about ordering if they are issued at the same time. > >> >> > >> > > >> > It's not a problem for a local filesystem as there's only one set of > >> > metadata to deal with. Even if the writes aren't synced out to disk, > >> > you still know how big the file is. > >> > > >> > With a client/server setup like cifs or nfs, you have to deal with two, > >> > and when there are buffered writes then there will be discrepancies. > >> > The simplest way to deal with discrepancies there is to make sure > >> > that there aren't any by flushing out any buffered writes before > >> > fetching the attributes. > >> > >> it may sound simpler but doesn't prevent a write racing in just > >> before we send the QueryFileInfo on the wire, and it does > >> hurt performance to redundantly invoke filemap_fdatawrite. > >> > > > > IIUC, POSIX says that the st_size must reflect the results of all > > writes done up to the point that the stat() call was issued. If some > > race in after the fact, then that's generally ok, but we don't want a > > situation where someone issues a ton of writes and then the stat() call > > doesn't reflect the result of them. > > aio_write doesn't return until filemap_fdatawrite completes and it > calls cifs_writepages which is synchronous - so the st_size > on the server will be uptodate if the write has completed. > Used to be synchronous. One of the goals of the async write patchset I'm working on is to change that. -- Jeff Layton <jlayton@xxxxxxxxxx> -- To unsubscribe from this list: send the line "unsubscribe linux-cifs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html