On 03/28/2011 12:43 PM, Burnash, James wrote:
> I am receiving an error on a client trying to access a gluster mount
> (/pfs2, in this case).
>
> [2011-03-28 12:26:17.897887] I
> [dht-layout.c:588:dht_layout_normalize] pfs-ro1-dht: found anomalies
> in /. holes=1 overlaps=2
>
> This is seen on the client in /var/log/glusterfs/pfs2.log, which is
> the log for the mount point associated with that storage.
>
> All other clients accessing the same storage do not exhibit the
> hanging symptom, and have no such entry in their logs.
>
> One possibly helpful note: this node worked fine until I upgraded
> the client from 3.1.1-1 to 3.1.3-1 on the x86_64 architecture,
> running CentOS 5.2. Even after I completely uninstalled GlusterFS
> from this node and reinstalled 3.1.1-1, the problem persisted.
>
> Here is the RPM info:
>
> root@jc1lnxsamm33:~# rpm -qa fuse
> fuse-2.7.4-8.el5.x86_64
> root@jc1lnxsamm33:~# rpm -qa "glusterfs*"
> glusterfs-fuse-3.1.1-1.x86_64
> glusterfs-core-3.1.1-1.x86_64
> glusterfs-debuginfo-3.1.1-1.x86_64
>
> Servers are four machines in a Distributed-Replicate configuration,
> running CentOS 5.5 and GlusterFS 3.1.3-1.
>
> Volume Name: pfs-ro1
> Type: Distributed-Replicate
> Status: Started
> Number of Bricks: 20 x 2 = 40
> Transport-type: tcp
> Bricks:
> Brick1: jc1letgfs17-pfs1:/export/read-only/g01
> Brick2: jc1letgfs18-pfs1:/export/read-only/g01
> Brick3: jc1letgfs17-pfs1:/export/read-only/g02
> Brick4: jc1letgfs18-pfs1:/export/read-only/g02
> ...
> Brick35: jc1letgfs14-pfs1:/export/read-only/g08
> Brick36: jc1letgfs15-pfs1:/export/read-only/g08
> Brick37: jc1letgfs14-pfs1:/export/read-only/g09
> Brick38: jc1letgfs15-pfs1:/export/read-only/g09
> Brick39: jc1letgfs14-pfs1:/export/read-only/g10
> Brick40: jc1letgfs15-pfs1:/export/read-only/g10
> Options Reconfigured:
> performance.stat-prefetch: on
> performance.cache-size: 2GB
> network.ping-timeout: 10
>
> Any help greatly appreciated.

Can you execute the following command on each of the brick roots?

    getfattr -d -e hex -n trusted.glusterfs.dht $brick_root

That should give a clearer picture of what the layouts look like, and
what those gaps/overlaps are. How they happened is a bit of another
story. I see this kind of thing pretty often, but I know it's because
of some Weird Stuff (tm) I do in CloudFS. I'm not aware of any bugs
etc. that would cause this in other contexts.
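
In case it helps with reading the output: here's a rough sketch of how
you could decode those values once you've collected them. This assumes
the usual on-disk format of that xattr, four big-endian 32-bit words
where the last two are the start and stop of the hash range assigned to
the brick. The brick names and hex strings below are made up for
illustration, and since replica pairs share the same range you'd only
feed in one brick per pair:

    #!/usr/bin/env python3
    # Sketch: decode trusted.glusterfs.dht values and flag holes/overlaps.
    # Assumption: the xattr is four big-endian 32-bit words, with words
    # three and four being the start and stop of the brick's hash range.
    import struct

    # Hex strings as printed by getfattr; these values are hypothetical.
    layouts = {
        "brick-g01": "0x0000000000000001000000007ffffffe",
        "brick-g02": "0x000000000000000180000000ffffffff",
    }

    spans = []
    for brick, value in layouts.items():
        raw = bytes.fromhex(value[2:])          # strip the "0x" prefix
        _, _, start, stop = struct.unpack(">4I", raw)
        spans.append((start, stop, brick))

    # Walk the ranges in order; the full hash space is 0x0-0xffffffff.
    spans.sort()
    expected = 0
    for start, stop, brick in spans:
        if start > expected:
            print("hole before %s: 0x%08x-0x%08x" % (brick, expected, start - 1))
        elif start < expected:
            print("overlap at %s: 0x%08x-0x%08x" % (brick, start, expected - 1))
        expected = max(expected, stop + 1)
    if expected <= 0xffffffff:
        print("hole at end: 0x%08x-0xffffffff" % expected)

Walking the sorted ranges like this should make it easy to map the
holes=N/overlaps=N counts from dht_layout_normalize to specific bricks.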