Hi,

I am now looking at the same problem in synchronous mode, based on the Linux NFS source.

Even in synchronous mode (O_SYNC ...), it seems the NFS client sends as many WRITE requests to the server as there are cache pages that the user data is split across on the client side (ref. nfs_writepage_sync(), nfs_writepage() in fs/nfs/write.c, nfs_updatepage() in fs/nfs/write.c, nfs_commit_write() in fs/nfs/file.c, generic_file_write() in mm/filemap.c, nfs_file_write() in fs/nfs/file.c).

For instance, if I request a synchronous write of 16 bytes at offset PAGE_SIZE - 8 of my file, I think the NFS client will send two WRITE messages to the server. Even if it uses "stable = NFS_FILE_SYNC" for both messages, the server can fail after ext3 has written the first one to stable storage but before it has written the second one. Then, if the client restarts upon server failure (this is the case in some projects), the file is found with only 8 bytes updated instead of 16.

I propose the following conditions to provide atomicity of writes through NFS + ext3 with the current implementation:
- ext3 journaled mode
- wsize mount option >= PAGE_SIZE
- O_SYNC on open()
- data size <= PAGE_SIZE
- (file offset modulo PAGE_SIZE) + data size <= PAGE_SIZE

Another possibility would be to modify the NFS implementation so that it sends only one WRITE message when the total size is less than wsize (whatever the number of cache pages used)?

Regards,
Eric

"Stephen C. Tweedie" wrote:

> Hi,
>
> On Fri, Jan 11, 2002 at 10:37:37AM +0100, eric chacron wrote:
>
> > To answer your question, the problem seems to be reproducible only in
> > asynchronous mode (without O_SYNC).
> > I have reproduced the case (without O_SYNC) using different record sizes:
> > from 1 K to 64 K, but not with 512 bytes.
> > It makes sense that the zeroed holes in the file are caused by the NFS
> > client's lack of serialisation/ordering as the file is being extended.
> > With O_SYNC I haven't reproduced the same problem for the moment.
>
> Right --- that's standard unix semantics for writeback.  Writes to
> backing store are completely unordered unless you request ordering
> with O_SYNC or f[data]sync.
>
> Cheers,
>  Stephen
>
> _______________________________________________
>
> Ext3-users@redhat.com
> https://listman.redhat.com/mailman/listinfo/ext3-users
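
The page-split arithmetic behind the 16-byte example, and the two size/offset conditions proposed in the message, can be sketched as below. This is only a model of the behaviour described above (one WRITE per cache page touched), not the NFS client code itself. PAGE_SIZE = 4096 is an assumption, and split_by_page() and fits_in_one_page() are hypothetical helper names chosen for illustration.

```python
PAGE_SIZE = 4096  # assumption: a common page size; the message does not state one


def split_by_page(offset, size, page_size=PAGE_SIZE):
    """Return the (offset, length) chunks a write is cut into, one per page touched.

    Models the claim in the message that the client issues one WRITE
    per cache page; it does not call into any NFS code.
    """
    chunks = []
    end = offset + size
    while offset < end:
        # First byte of the page following the one that contains `offset`.
        page_end = (offset // page_size + 1) * page_size
        length = min(end, page_end) - offset
        chunks.append((offset, length))
        offset += length
    return chunks


def fits_in_one_page(offset, size, page_size=PAGE_SIZE):
    """Check the proposed condition: (offset mod PAGE_SIZE) + size <= PAGE_SIZE."""
    return size > 0 and (offset % page_size) + size <= page_size


if __name__ == "__main__":
    # The example from the message: 16 bytes written at offset PAGE_SIZE - 8.
    print(split_by_page(PAGE_SIZE - 8, 16))
    print(fits_in_one_page(PAGE_SIZE - 8, 16))
```

Under this model, the 16-byte write at PAGE_SIZE - 8 yields two chunks of 8 bytes each, which is the case where a server failure between the two WRITEs could leave only the first 8 bytes updated. A write that satisfies fits_in_one_page() yields a single chunk.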