A similar issue was fixed in the master branch recently. Can you apply http://review.gluster.org/4459 to your source, rebuild, and retest, and see if the issue gets fixed for you? It is quite a trivial patch and might even apply as-is to the 3.2.7 source.

Avati

On Mon, Feb 18, 2013 at 11:29 AM, Douglas Colkitt <douglas.colkitt at gmail.com> wrote:

> Hi, I'm running into a rather strange and frustrating bug, and I'm wondering
> if anyone on the mailing list might have some insight into what might be
> causing it. I'm running a cluster of two dozen nodes, where the processing
> nodes are also the gluster bricks (using the SLURM resource manager). Each
> node has the gluster volume mounted natively (not NFS). All nodes are using
> v3.2.7. Each job on a node runs a shell script like so:
>
> containerDir=$1
> groupNum=$2
> mkdir -p $containerDir
> ./generateGroupGen.py $groupNum >$containerDir/$groupNum.out
>
> The following jobs are then run:
>
> runGroupGen [glusterDirectory] 1
> runGroupGen [glusterDirectory] 2
> runGroupGen [glusterDirectory] 3
> ...
>
> Typically about 200 jobs launch within milliseconds of each other, so the
> glusterfs/fuse mount receives a large number of simultaneous directory-create
> and file-create system calls within a very short time.
>
> For some jobs the output file inside the directory exists but contains no
> output. When this occurs, it is always the case that either all jobs on a
> node behave normally or all fail to produce output. It should be noted that
> there are no error messages generated by the processes themselves, and all
> processes on the no-output node exit without an error code. In that sense
> the failure is silent, but it corrupts the data, which is dangerous. The only
> indication of error is entries (on the no-output nodes) in
> /var/log/distrib-glusterfs.log of the form:
>
> [2013-02-18 05:55:31.382279] E [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-volume1-client-16: remote operation failed: Stale NFS file handle
> [2013-02-18 05:55:31.382302] E [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-volume1-client-17: remote operation failed: Stale NFS file handle
> [2013-02-18 05:55:31.382327] E [client3_1-fops.c:2228:client3_1_lookup_cbk] 0-volume1-client-18: remote operation failed: Stale NFS file handle
> [2013-02-18 05:55:31.640791] W [inode.c:1044:inode_path] (-->/usr/lib/glusterfs/3.2.7/xlator/mount/fuse.so(+0xe8fd) [0x7fa8341868fd] (-->/usr/lib/glusterfs/3.2.7/xlator/mount/fuse.so(+0xa6bb) [0x7fa8341826bb] (-->/usr/lib/glusterfs/3.2.7/xlator/mount/fuse.so(fuse_loc_fill+0x1c6) [0x7fa83417d156]))) 0-volume1/inode: no dentry for non-root inode -69777006931: 0a37836d-e9e5-4cc1-8bd2-e8a49947959b
> [2013-02-18 05:55:31.640865] W [fuse-bridge.c:561:fuse_getattr] 0-glusterfs-fuse: 2298073: GETATTR 140360215569520 (fuse_loc_fill() failed)
> [2013-02-18 05:55:31.641672] W [inode.c:1044:inode_path] (-->/usr/lib/glusterfs/3.2.7/xlator/mount/fuse.so(+0xe8fd) [0x7fa8341868fd] (-->/usr/lib/glusterfs/3.2.7/xlator/mount/fuse.so(+0xa6bb) [0x7fa8341826bb] (-->/usr/lib/glusterfs/3.2.7/xlator/mount/fuse.so(fuse_loc_fill+0x1c6) [0x7fa83417d156]))) 0-volume1/inode: no dentry for non-root inode -69777006931: 0a37836d-e9e5-4cc1-8bd2-e8a49947959b
> [2013-02-18 05:55:31.641724] W [fuse-bridge.c:561:fuse_getattr] 0-glusterfs-fuse: 2298079: GETATTR 140360215569520 (fuse_loc_fill() failed)
> ...
>
> Sometimes on these events, and sometimes not, there are also log entries
> (on both normal and abnormal nodes) of the form:
>
> [2013-02-18 03:35:28.679681] I [dht-common.c:525:dht_revalidate_cbk] 0-volume1-dht: mismatching layouts for /inSample/pred/20110831
>
> I understand from reading the mailing list that the dentry errors and the
> mismatched-layout errors are both non-fatal warnings and that the metadata
> will become internally consistent regardless. But these errors only happen
> at times when I'm slamming the glusterfs system with the creation of a bunch
> of small files in a very short burst, as described above. So their presence
> seems to be related to the failure.
>
> I think the issue is almost assuredly related to the delayed propagation of
> glusterfs directory metadata. Some nodes are creating directories
> simultaneously with other nodes, and the two are producing inconsistencies
> in the dht layout information. My hypothesis is that, while Node A is still
> writing, the process of resolving the inconsistencies and propagating the
> metadata from Node B renders the location Node A is writing to disconnected
> from its supposed path (hence the "no dentry" errors).
>
> I've made some effort to go through the glusterfs source code, particularly
> the dht-related files. The way dht normalizes anomalies could be the
> problem, but I've failed to find anything specific.
>
> Has anyone else run into a problem like this, or does anyone have insight
> into what might be causing it or how to avoid it?
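
For reference, a rough sketch of the apply/rebuild/retest procedure Avati suggests, assuming the change at http://review.gluster.org/4459 has been downloaded from its Gerrit page as a plain patch file (the file name 4459.patch, the -p1 strip level, and the source directory name are assumptions; configure options should match those of the existing 3.2.7 build):

cd glusterfs-3.2.7                     # unpacked 3.2.7 source tree (assumed location)
patch -p1 --dry-run < ../4459.patch    # first check whether the patch applies cleanly
patch -p1 < ../4459.patch              # apply it for real
./configure && make                    # rebuild with the same options as the running build
sudo make install                      # reinstall, then remount the volume and retest

If it does not apply cleanly to 3.2.7, Avati notes the patch is trivial, so porting the rejected hunks by hand should be feasible.

On the "how to avoid it" question: one possible mitigation, not taken from this thread and purely a sketch, is to create the container directory once, from a single node, before the job array is submitted, so that 200 concurrent mkdir -p calls do not race on the same gluster path. The mount path, the job count, and the use of sbatch for submission are illustrative assumptions:

containerDir=/mnt/gluster/inSample/pred/20110831    # hypothetical gluster mount path
mkdir -p "$containerDir"                            # create the directory once, up front
for groupNum in $(seq 1 200); do
    sbatch runGroupGen "$containerDir" "$groupNum"  # assumed SLURM submission of the per-group script
done

This does not address the underlying dht race; it only removes the burst of simultaneous directory creations, while the simultaneous file creations inside the directory remain.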