Hey folks,
in our production setup with 3 nodes (HCI) we took one host down
(maintenance, stop gluster, poweroff via ssh/ovirt engine). Once it was
up, gluster had 2k healing entries that went down to 2 in a matter of 10
minutes.
Those two give me a headache:
[root@node03:~] # gluster vol heal ssd_storage info
Brick node01:/gluster_bricks/ssd_storage/ssd_storage
<gfid:a121e4fb-0984-4e41-94d7-8f0c4f87f4b6>
<gfid:6f8817dc-3d92-46bf-aa65-a5d23f97490e>
Status: Connected
Number of entries: 2
Brick node02:/gluster_bricks/ssd_storage/ssd_storage
Status: Connected
Number of entries: 0
Brick node03:/gluster_bricks/ssd_storage/ssd_storage
<gfid:a121e4fb-0984-4e41-94d7-8f0c4f87f4b6>
<gfid:6f8817dc-3d92-46bf-aa65-a5d23f97490e>
Status: Connected
Number of entries: 2
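To keep an eye on which bricks still list entries, the heal info output above can be parsed mechanically. This is a minimal Python sketch, assuming the output format is exactly as pasted here; `parse_heal_info` is a hypothetical helper, not part of gluster.

```python
def parse_heal_info(text):
    """Map each brick to the list of entries it reports as pending heal."""
    pending = {}
    brick = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            # A "Brick host:/path" line opens a new section.
            brick = line[len("Brick "):]
            pending[brick] = []
        elif brick and line and not line.startswith(("Status:", "Number of entries:")):
            # Anything else inside a section is an entry (a path or <gfid:...>).
            pending[brick].append(line)
    return pending


# Sample taken from the output above (first two bricks only).
sample = """\
Brick node01:/gluster_bricks/ssd_storage/ssd_storage
<gfid:a121e4fb-0984-4e41-94d7-8f0c4f87f4b6>
<gfid:6f8817dc-3d92-46bf-aa65-a5d23f97490e>
Status: Connected
Number of entries: 2
Brick node02:/gluster_bricks/ssd_storage/ssd_storage
Status: Connected
Number of entries: 0
"""

for brick_name, entries in parse_heal_info(sample).items():
    print(brick_name, len(entries), entries)
```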
No paths, only gfids. We took down node02, so it does not have the file:
[root@node01:~] # md5sum /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
75c4941683b7eabc223fc9d5f022a77c  /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
[root@node02:~] # md5sum /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
md5sum: /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6: No such file or directory
[root@node03:~] # md5sum /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
75c4941683b7eabc223fc9d5f022a77c  /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
The copies on node01 and node03 are md5-identical.
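The brick-side paths used in the md5sum calls follow directly from the gfid: the first two hex characters, then the next two, then the full gfid, all under `.glusterfs` in the brick root. A minimal sketch of that mapping, based only on the paths shown in this mail; `gfid_backend_path` is a hypothetical helper name.

```python
import posixpath
import uuid


def gfid_backend_path(brick_root, gfid):
    """Build the .glusterfs path for a gfid, following the layout seen above."""
    # uuid.UUID validates the input; str() returns the lowercase hyphenated form.
    canonical = str(uuid.UUID(gfid))
    return posixpath.join(
        brick_root, ".glusterfs", canonical[:2], canonical[2:4], canonical
    )


brick = "/gluster_bricks/ssd_storage/ssd_storage"
for gfid in (
    "a121e4fb-0984-4e41-94d7-8f0c4f87f4b6",
    "6f8817dc-3d92-46bf-aa65-a5d23f97490e",
):
    print(gfid_backend_path(brick, gfid))
```

The same call with the second gfid gives the path to check for the other pending entry.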
Their extended attributes are identical, too:
[root@node01:~] # getfattr -d -m . -e hex /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
getfattr: Removing leading '/' from absolute path names
# file: gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.ssd_storage-client-1=0x0000004f0000000100000000
trusted.gfid=0xa121e4fb09844e4194d78f0c4f87f4b6
trusted.gfid2path.d4cf876a215b173f=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f38366461303238392d663734662d343230302d393238342d3637386537626437363139352e31323030
trusted.glusterfs.mdata=0x010000000000000000000000005e349b1e000000001139aa2a000000005e349b1e000000001139aa2a000000005e34994900000000304a5eb2
getfattr: Removing leading '/' from absolute path names
# file: gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.ssd_storage-client-1=0x0000004f0000000100000000
trusted.gfid=0xa121e4fb09844e4194d78f0c4f87f4b6
trusted.gfid2path.d4cf876a215b173f=0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f38366461303238392d663734662d343230302d393238342d3637386537626437363139352e31323030
trusted.glusterfs.mdata=0x010000000000000000000000005e349b1e000000001139aa2a000000005e349b1e000000001139aa2a000000005e34994900000000304a5eb2
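The hex values are easier to reason about once decoded. The sketch below rests on two assumptions that are not stated anywhere in this mail: that a `trusted.afr.<volume>-client-N` value is three big-endian 32-bit counters (pending data, metadata and entry operations, in that order), and that the `trusted.gfid2path.*` value is plain ASCII of the form `<parent gfid>/<basename>`. The helper names are hypothetical.

```python
import struct


def _hex_to_bytes(value):
    """Accept getfattr's '0x...' notation and return the raw bytes."""
    return bytes.fromhex(value[2:] if value.startswith("0x") else value)


def decode_afr_pending(value):
    """Split a 12-byte AFR changelog value into its three counters.

    Assumption: three big-endian unsigned 32-bit integers in the order
    data, metadata, entry. struct.unpack raises if the length is not 12 bytes.
    """
    data, metadata, entry = struct.unpack(">III", _hex_to_bytes(value))
    return {"data": data, "metadata": metadata, "entry": entry}


def decode_text_xattr(value):
    """Decode an xattr whose payload is assumed to be plain ASCII."""
    return _hex_to_bytes(value).decode("ascii")


# Values copied from the getfattr output above.
pending = decode_afr_pending("0x0000004f0000000100000000")
print(pending)

parent_and_name = decode_text_xattr(
    "0x62653331383633382d653861302d346336642d393737642d3761393337616138343830362f38366461303238392d663734662d343230302d393238342d3637386537626437363139352e31323030"
)
print(parent_and_name)
```

Under those assumptions, the `ssd_storage-client-1` value reads as 79 pending data operations, 1 pending metadata operation and 0 pending entry operations. If client indices count bricks from 0 in volume order (also an assumption), client-1 would be the node02 brick, which is the host that was taken down. The decoded gfid2path string may help to map the gfid back to a file name.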
The only thing I can see is the different change times, really:
[root@node01:~] # stat /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
File: ‘/gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6’
Size: 67108864 Blocks: 54576 IO Block: 4096 regular file
Device: fd09h/64777d Inode: 16152829909 Links: 2
Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ root)
Context: system_u:object_r:glusterd_brick_t:s0
Access: 2020-01-31 22:16:57.812620635 +0100
Modify: 2020-02-01 07:19:24.183045141 +0100
Change: 2020-02-01 07:19:24.186045203 +0100
Birth: -
[root@node03:~] # stat /gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6
File: ‘/gluster_bricks/ssd_storage/ssd_storage/.glusterfs/a1/21/a121e4fb-0984-4e41-94d7-8f0c4f87f4b6’
Size: 67108864 Blocks: 54576 IO Block: 4096 regular file
Device: fd09h/64777d Inode: 16154259424 Links: 2
Access: (0660/-rw-rw----) Uid: ( 0/ root) Gid: ( 0/ root)
Context: system_u:object_r:glusterd_brick_t:s0
Access: 2020-01-31 22:16:57.811800217 +0100
Modify: 2020-02-01 07:19:24.180939487 +0100
Change: 2020-02-01 07:19:24.184939586 +0100
Birth: -
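To put a number on the timestamp differences, the stat lines can be compared at nanosecond resolution. A minimal sketch, assuming the GNU stat format shown above; `parse_stat_time_ns` is a hypothetical helper. It only measures the skew between the two bricks and says nothing about whether that skew matters for healing.

```python
from datetime import datetime


def parse_stat_time_ns(text):
    """Convert a GNU stat timestamp into integer nanoseconds since the epoch."""
    date_part, time_part, offset = text.split()
    whole, frac = time_part.split(".")
    # datetime only keeps microseconds, so parse the whole seconds with it
    # and add the nanosecond fraction by hand.
    moment = datetime.strptime(
        f"{date_part} {whole} {offset}", "%Y-%m-%d %H:%M:%S %z"
    )
    return int(moment.timestamp()) * 1_000_000_000 + int(frac.ljust(9, "0"))


# Timestamps copied from the stat output above.
node01 = {
    "Modify": "2020-02-01 07:19:24.183045141 +0100",
    "Change": "2020-02-01 07:19:24.186045203 +0100",
}
node03 = {
    "Modify": "2020-02-01 07:19:24.180939487 +0100",
    "Change": "2020-02-01 07:19:24.184939586 +0100",
}

for field in node01:
    delta_ns = parse_stat_time_ns(node01[field]) - parse_stat_time_ns(node03[field])
    print(f"{field}: node01 - node03 = {delta_ns} ns")
```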
Now, I don't dare simply proceed without some advice.
Anyone got a clue on how to resolve this issue? File #2 is identical to
this one, from a problem point of view.
Have a great weekend!
-Chris.
--
with kind regards,
mit freundlichen Gruessen,
Christian Reiss