Re: Nfs filesystem corruption(?) after kmail crash

On Wed, 4 Jun 2008 14:10:01 +0200
"Alexander Borghgraef" <alexander.borghgraef.rma@xxxxxxxxx> wrote:

> On Mon, Jun 2, 2008 at 3:43 PM, Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> > On Mon, 2 Jun 2008 15:05:17 +0200
> > "Alexander Borghgraef" <alexander.borghgraef.rma@xxxxxxxxx> wrote:
> >
> >> Nobody? Anyone care to tell me how to interpret the strace stat cur output?
> >>
> >
> >> lstat64("cur", 0xbfb81cb4)              = -1 ENOENT (No such file or directory)
> >
> > File doesn't exist...
> >
> > If this is from "ls -l" or something like that, that means that the
> > client did a READDIR or READDIRPLUS and saw a "cur" entry in the
> > directory with a particular filehandle. It then went back and did a
> > stat() against that filehandle and it was gone. The two possibilities
> > are that something removed that directory in the interim (possibly
> > replacing it with a new "cur" directory), or that the filehandle was
> > bad for some reason. I'm not aware of any bugs causing the latter, so
> > the former is the most likely.
> 
>  So it's possible that kmail in syncing accesses the cur directory,
> reads it, and then removes and replaces the directory before all of
> the read operation's actions are executed due to the difference in
> time granularity between nfs and ext3? If so, should I file this as a
> bug report to the kdepim people? I've looked a bit into the kmail
> code, and I traced the error message to an access() call (from
> unistd.h) on the directory's path which fails, but that probably just
> notices the problem instead of causing it. I haven't really figured out how
> their syncing process works.
> 

My suspicion is rather that this directory is being removed by a
process on a different client (or maybe on the server). If this directory
is only being changed by the client itself, then something is definitely
not working right. The client should generally be aware of changes that
it makes itself.

I doubt this is a userspace bug per se, though there are certainly
ways to write userspace code that are friendlier to NFS. My
suggestion would be to get some network captures and determine at
what point the filehandle changes when this happens.
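
As a rough userspace complement to the captures, something like the
following could be left running on the client to log when "cur" goes
missing or comes back as a different object. This is only a sketch: the
helper names (dir_identity, classify_change, watch) are made up for
illustration, and it assumes that a removed-and-recreated directory
shows up with a different inode number on the client. That usually
holds, but it is not guaranteed, and client attribute caching can delay
what lstat() reports.

```python
import os
import time


def dir_identity(path):
    """Return (st_dev, st_ino) for path, or None if it does not exist."""
    try:
        st = os.lstat(path)
    except FileNotFoundError:
        return None
    return (st.st_dev, st.st_ino)


def classify_change(previous, current):
    """Compare two dir_identity() results and name what happened."""
    if previous == current:
        return "unchanged"
    if current is None:
        return "missing"
    if previous is None:
        return "appeared"
    # Same path, different device/inode pair: assumed to mean the
    # directory was removed and a new one was created in its place.
    return "replaced"


def watch(path, interval=1.0, rounds=10):
    """Poll path and return a list of (timestamp, state, old, new) events."""
    previous = dir_identity(path)
    events = []
    for _ in range(rounds):
        time.sleep(interval)
        current = dir_identity(path)
        state = classify_change(previous, current)
        if state != "unchanged":
            events.append((time.time(), state, previous, current))
        previous = current
    return events


# Hypothetical usage: watch("cur", interval=0.5, rounds=600)
```

The timestamps of any "missing" or "replaced" events can then be lined
up against the capture to see which client issued the REMOVE or RMDIR.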

An even better thing would be to track down a way to reliably reproduce
this. With that we could offer a more comprehensive explanation.
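
One possible starting point for a reproducer is to mimic what "ls -l"
does: read the directory, then lstat() every name that came back, and
report the names that have vanished in between. The sketch below
assumes nothing about kmail itself, and find_vanished_entries is a
hypothetical helper name. Run in a loop against the maildir while kmail
syncs, it may or may not hit the same window as the original strace.

```python
import os


def find_vanished_entries(directory):
    """List directory, then lstat() each entry.

    Returns the names that were returned by the listing but no longer
    existed by the time lstat() was called on them (ENOENT).
    """
    vanished = []
    for name in os.listdir(directory):
        try:
            os.lstat(os.path.join(directory, name))
        except FileNotFoundError:
            vanished.append(name)
    return vanished


# Hypothetical usage: print(find_vanished_entries("/path/to/maildir"))
```

An empty list on every pass while the problem still shows up in kmail
would suggest the window is too narrow to catch this way, and the
captures remain the better tool.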

-- 
Jeff Layton <jlayton@xxxxxxxxxx>