inconsistent file content after killing nfs daemon


 




Hi,

I am now looking at the same problem in synchronous mode, based on the Linux nfs source.

Even in synchronous mode (O_SYNC ...), it seems the nfs client sends as many
WRITE requests to the server as the number of cache pages the user data is
split into on the client side (ref. nfs_writepage_sync() and nfs_writepage()
in fs/nfs/write.c, nfs_updatepage() in fs/nfs/write.c, nfs_commit_write() in
fs/nfs/file.c, generic_file_write() in mm/filemap.c, nfs_file_write() in
fs/nfs/file.c).

For instance, if I request a synchronous write of 16 bytes at offset
PAGE_SIZE - 8 of my file, I think the nfs client will send two WRITE
messages to the server. Even if it uses "stable = NFS_FILE_SYNC" for both
messages, the server can fail after the first one has been written by ext3
to stable storage but before the second one.
Then, if the client restarts upon server failure (as it does in some
projects), the file is found with only 8 bytes updated instead of 16.

I propose the following conditions to provide atomicity of a write through
nfs + ext3, with the current implementation:
- ext3 journaled mode
- wsize mount option >= PAGE_SIZE
- O_SYNC on open()
- data size <= PAGE_SIZE
- (file offset mod PAGE_SIZE) + data size <= PAGE_SIZE

Another possibility would be to modify the nfs implementation so that only
one WRITE message is sent when the total size is less than wsize (whatever
the number of cache pages used)?

Regards,
Eric


"Stephen C. Tweedie" wrote:

> Hi,
>
> On Fri, Jan 11, 2002 at 10:37:37AM +0100, eric chacron wrote:
>
> > To answer your question, the problem seems to be reproducible only in
> > asynchronous mode (without O_SYNC).
> > I have reproduced the case (without O_SYNC) using different record sizes:
> > from 1 K to 64 K, but not with 512 bytes.
> > It makes sense that the zeroed holes in the file are caused by the nfs
> > client's absence of serialisation/ordering as the file is extended.
> > With O_SYNC I haven't reproduced the same problem for the moment.
>
> Right --- that's standard unix semantics for writeback.  Writes to
> backing store are completely unordered unless you request ordering
> with O_SYNC or f[data]sync.
>
> Cheers,
>  Stephen
>
> _______________________________________________
> 
> Ext3-users@redhat.com
> https://listman.redhat.com/mailman/listinfo/ext3-users





