It looks even worse than I had feared...
:-(
This really is a crazy bug. If I understand you correctly, the only sane pairing of the xattrs is that of the two 0-byte files, since this is the full list of bricks:

[root@gluster01 ~]# gluster volume info

Volume Name: sr_vol01
Type: Distributed-Replicate
Volume ID: c6d6147e-2d91-4d98-b8d9-ba05ec7e4ad6
Status: Started
Number of Bricks: 21 x 2 = 42
Transport-type: tcp
Bricks:
Brick1: gluster01:/export/brick1gfs01
Brick2: gluster02:/export/brick1gfs02
Brick3: gluster01:/export/brick4gfs01
Brick4: gluster03:/export/brick4gfs03
Brick5: gluster02:/export/brick4gfs02
Brick6: gluster03:/export/brick1gfs03
Brick7: gluster01:/export/brick2gfs01
Brick8: gluster02:/export/brick2gfs02
Brick9: gluster01:/export/brick5gfs01
Brick10: gluster03:/export/brick5gfs03
Brick11: gluster02:/export/brick5gfs02
Brick12: gluster03:/export/brick2gfs03
Brick13: gluster01:/export/brick3gfs01
Brick14: gluster02:/export/brick3gfs02
Brick15: gluster01:/export/brick6gfs01
Brick16: gluster03:/export/brick6gfs03
Brick17: gluster02:/export/brick6gfs02
Brick18: gluster03:/export/brick3gfs03
Brick19: gluster01:/export/brick8gfs01
Brick20: gluster02:/export/brick8gfs02
Brick21: gluster01:/export/brick9gfs01
Brick22: gluster02:/export/brick9gfs02
Brick23: gluster01:/export/brick10gfs01
Brick24: gluster03:/export/brick10gfs03
Brick25: gluster01:/export/brick11gfs01
Brick26: gluster03:/export/brick11gfs03
Brick27: gluster02:/export/brick10gfs02
Brick28: gluster03:/export/brick8gfs03
Brick29: gluster02:/export/brick11gfs02
Brick30: gluster03:/export/brick9gfs03
Brick31: gluster01:/export/brick12gfs01
Brick32: gluster02:/export/brick12gfs02
Brick33: gluster01:/export/brick13gfs01
Brick34: gluster02:/export/brick13gfs02
Brick35: gluster01:/export/brick14gfs01
Brick36: gluster03:/export/brick14gfs03
Brick37: gluster01:/export/brick15gfs01
Brick38: gluster03:/export/brick15gfs03
Brick39: gluster02:/export/brick14gfs02
Brick40: gluster03:/export/brick12gfs03
Brick41: gluster02:/export/brick15gfs02
Brick42: gluster03:/export/brick13gfs03

The two 0-byte files are on bricks 35 and 36, as the getfattr output correctly shows. Another sane pairing could be this one (if the first file did not also refer to client-34 and client-35):

[root@gluster01 ~]# getfattr -m . -d -e hex /export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
getfattr: Removing leading '/' from absolute path names
# file: export/brick13gfs01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.sr_vol01-client-32=0x000000000000000000000000
trusted.afr.sr_vol01-client-33=0x000000000000000000000000
trusted.afr.sr_vol01-client-34=0x000000000000000000000000
trusted.afr.sr_vol01-client-35=0x000000010000000100000000
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

[root@gluster02 ~]# getfattr -m . -d -e hex /export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
getfattr: Removing leading '/' from absolute path names
# file: export/brick13gfs02/272b2366-dfbf-ad47-2a0f-5d5cc40863e3/3009f448-cf6e-413f-baec-c3b9f0cf9d72.vhd
security.selinux=0x73797374656d5f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.sr_vol01-client-32=0x000000000000000000000000
trusted.afr.sr_vol01-client-33=0x000000000000000000000000
trusted.gfid=0xaefd184508414a8f8408f1ab8aa7a417

But why is the security.selinux value different?
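For what it's worth, here is how I decoded those hex values by hand, assuming I have the format right (the trusted.afr.* changelog being three big-endian 32-bit counters for pending data / metadata / entry operations, and security.selinux being just the label text in hex):

# trusted.afr.sr_vol01-client-35 = 0x 00000001 00000001 00000000
#                                       data=1  metadata=1  entry=0
# i.e. the brick13gfs01 copy claims one pending data and one pending
# metadata operation against client-35, which is not even in its own
# replica pair.

# security.selinux is plain ASCII in hex (trailing 00 stripped):
echo 756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a7330 | xxd -r -p
# -> unconfined_u:object_r:file_t:s0   (brick13gfs01 copy)
echo 73797374656d5f753a6f626a6563745f723a66696c655f743a7330 | xxd -r -p
# -> system_u:object_r:file_t:s0       (brick13gfs02 copy)

So the two labels differ only in the SELinux user part, which presumably just means the two copies were written by differently-labelled processes rather than anything gluster-specific.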
You mention hostname changes... I noticed that if I list the available shared storage on one of the XenServers I get:

uuid ( RO)                : 272b2366-dfbf-ad47-2a0f-5d5cc40863e3
          name-label ( RW): gluster_store
    name-description ( RW): NFS SR [gluster01.irceline.be:/sr_vol01]
                host ( RO): <shared>
                type ( RO): nfs
        content-type ( RO):

and if I check with plain Linux tools:

[root@same_story_on_both_xenserver ~]# mount
gluster02.irceline.be:/sr_vol01/272b2366-dfbf-ad47-2a0f-5d5cc40863e3 on /var/run/sr-mount/272b2366-dfbf-ad47-2a0f-5d5cc40863e3 type nfs (rw,soft,timeo=133,retrans=2147483647,tcp,noac,addr=192.168.0.72)

Originally the mount was made against gluster01 (IP 192.168.0.71), as the name-description in the xe sr-list output indicates. It is as though, when gluster01 was unavailable for a couple of minutes, the NFS mount was somehow automatically reconfigured to point at gluster02. But NFS cannot do that as far as I know (unless there is some fail-over mechanism, which I never configured), and there is no load-balancing between client and server either. If gluster01 is not available, the gluster volume should simply have been unavailable, end of story. Then again, from a client's perspective the NFS mount could go to any one of the three gluster nodes; the client should see exactly the same data.

So a rebalance in the current state could do more harm than good? I launched a second rebalance in the hope that the system would mend itself after all...
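Before letting the rebalance go much further I will probably run a few read-only checks; sketching them here in case they are useful (assuming GlusterFS 3.x CLI syntax and the XenServer xe CLI, with our SR UUID):

# Which gluster host is the SR's PBD actually configured against?
xe pbd-list sr-uuid=272b2366-dfbf-ad47-2a0f-5d5cc40863e3 params=device-config

# Does gluster itself report files pending heal or in split-brain?
gluster volume heal sr_vol01 info
gluster volume heal sr_vol01 info split-brain

# State of the rebalance that is currently running:
gluster volume rebalance sr_vol01 status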
Thanks a million for your support in this darkest hour of my time as a glusterfs user :-)

Cheers,
Olav

On 20/02/15 23:10, Joe Julian wrote: