Re: gluster 5.6: Gfid mismatch detected

On 22/05/19 1:29 PM, Hu Bert wrote:
Hi Ravi,

mount path of the volume is /shared/public, so complete paths are
/shared/public/staticmap/120/710/ and
/shared/public/staticmap/120/710/120710351/ .

getfattr -n glusterfs.gfid.string /shared/public/staticmap/120/710/
getfattr: Removing leading '/' from absolute path names
# file: shared/public/staticmap/120/710/
glusterfs.gfid.string="751233b0-7789-4550-bd95-4dd9c8f57c19"

getfattr -n glusterfs.gfid.string /shared/public/staticmap/120/710/120710351/
getfattr: Removing leading '/' from absolute path names
# file: shared/public/staticmap/120/710/120710351/
glusterfs.gfid.string="eaf2f31e-b4a7-4fa8-b710-d6ff9cd4eace"

So that fits. It somehow took a couple of attempts to resolve this,
and none of the commands seem to have "officially" succeeded:

gluster3 (host with the "fail"):
gluster volume heal workdata split-brain source-brick
gluster1:/gluster/md4/workdata
/shared/public/staticmap/120/710/120710351/
Lookup failed on /shared/public/staticmap/120/710:No such file or directory

The file path given to this command must be the absolute path as seen from the root of the volume. So the location where it is mounted (/shared/public) must be omitted. Only /staticmap/120/710/120710351/ is required.

HTH,

Ravi
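
The path rule above can be sketched as a small shell helper. This is only an illustration: the function name volume_path is hypothetical and not part of gluster. It strips the client mount point (/shared/public in this thread) from a path so that what remains is the path as seen from the root of the volume.

```shell
# Hypothetical helper (not a gluster command): turn a path under the client
# mount point into the volume-relative path that
# "gluster volume heal ... split-brain" expects.
volume_path() {
    mount_point="${1%/}"   # drop one trailing slash from the mount point, if any
    full_path="$2"
    case "$full_path" in
        "$mount_point"/*)
            # Remove the mount point prefix; the leading "/" of the rest is kept.
            printf '%s\n' "${full_path#"$mount_point"}"
            ;;
        *)
            printf 'error: %s is not under %s\n' "$full_path" "$mount_point" >&2
            return 1
            ;;
    esac
}

# Possible use with the command from this thread (not executed here):
#   gluster volume heal workdata split-brain source-brick \
#       gluster1:/gluster/md4/workdata \
#       "$(volume_path /shared/public /shared/public/staticmap/120/710/120710351/)"
```

With the mount point and directory from this thread, the helper prints /staticmap/120/710/120710351/, which is the form the heal command needs.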

Volume heal failed.

gluster1 ("good" host):
gluster volume heal workdata split-brain source-brick
gluster1:/gluster/md4/workdata
/shared/public/staticmap/120/710/120710351/
Lookup failed on /shared/public/staticmap/120/710:No such file or directory
Volume heal failed.

Only in the logs do I see:

[2019-05-22 07:42:22.004182] I [MSGID: 108026]
[afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do]
0-workdata-replicate-0: performing metadata selfheal on
eaf2f31e-b4a7-4fa8-b710-d6ff9cd4eace
[2019-05-22 07:42:22.008502] I [MSGID: 108026]
[afr-self-heal-common.c:1729:afr_log_selfheal] 0-workdata-replicate-0:
Completed metadata selfheal on eaf2f31e-b4a7-4fa8-b710-d6ff9cd4eace.
sources=0 [1]  sinks=2

And "gluster volume heal workdata statistics heal-count" now reports
0 entries left. The files and directories are there. This is the first
time it has happened with this setup, but everything is OK now.

Thx for your fast help :-)


Hubert

On Wed, 22 May 2019 at 09:32, Ravishankar N
<ravishankar@xxxxxxxxxx> wrote:

On 22/05/19 12:39 PM, Hu Bert wrote:
Hi @ll,

Today I updated and rebooted the 3 servers of my replica 3 setup;
after the third one came up again I noticed this error:

[2019-05-22 06:41:26.781165] E [MSGID: 108008]
[afr-self-heal-common.c:392:afr_gfid_split_brain_source]
0-workdata-replicate-0: Gfid mismatch detected for
<gfid:751233b0-7789-4550-bd95-4dd9c8f57c19>/120710351>,
82025ab3-8034-4257-9628-d8ebde909629 on workdata-client-2 and
eaf2f31e-b4a7-4fa8-b710-d6ff9cd4eace on workdata-client-1.
120710351 seems to be the entry that is in split-brain. Is
/staticmap/120/710/120710351 the complete path to that entry? (check if
gfid:751233b0-7789-4550-bd95-4dd9c8f57c19 corresponds to the gfid of 710).

You can then try "gluster volume heal workdata split-brain source-brick
gluster1:/gluster/md4/workdata /staticmap/120/710/120710351"

-Ravi
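
To find the parent gfid and entry name without reading the log by eye, something like the following sketch may help. The function name gfid_mismatch_entries is hypothetical, and the pattern assumes the message looks exactly like the MSGID 108008 line quoted in this thread; other gluster versions may word it differently. Newlines are joined first because the log line is wrapped in this mail, which also means the whole input is read as one line, so it is better fed a small excerpt than a full log file.

```shell
# Hypothetical helper: read AFR log text on stdin and print
# "<parent-gfid> <entry-name>" for every "Gfid mismatch detected" message.
# Assumes the message format quoted in this thread.
gfid_mismatch_entries() {
    tr '\n' ' ' |
        grep -o 'Gfid mismatch detected for <gfid:[0-9a-f-]*>/[^>]*>' |
        sed -e 's/^.*<gfid://' -e 's/>$//' -e 's|>/| |'
}

# Possible use (the log file name is an assumption, not from this thread):
#   grep -A 3 'MSGID: 108008' /path/to/glustershd.log | gfid_mismatch_entries
```

For the message above this prints the parent gfid 751233b0-7789-4550-bd95-4dd9c8f57c19 followed by the entry name 120710351. The parent gfid can then be compared with the glusterfs.gfid.string value that getfattr reports for the suspected parent directory on the mount.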

[2019-05-22 06:41:27.069969] W [MSGID: 108027]
[afr-common.c:2270:afr_attempt_readsubvol_set] 0-workdata-replicate-0:
no read subvols for /staticmap/120/710/120710351
[2019-05-22 06:41:27.808532] W [fuse-bridge.c:582:fuse_entry_cbk]
0-glusterfs-fuse: 1834335: LOOKUP() /staticmap/120/710/120710351 => -1
(Transport endpoint is not connected)

A simple 'gluster volume heal workdata' didn't help; 'gluster volume
heal workdata info' says:

Brick gluster1:/gluster/md4/workdata
/staticmap/120/710
/staticmap/120/710/120710351
<gfid:fe7fdbe8-9a39-4793-8d38-6dfdd3d5089b>
Status: Connected
Number of entries: 3

Brick gluster2:/gluster/md4/workdata
/staticmap/120/710
/staticmap/120/710/120710351
<gfid:fe7fdbe8-9a39-4793-8d38-6dfdd3d5089b>
Status: Connected
Number of entries: 3

Brick gluster3:/gluster/md4/workdata
/staticmap/120/710/120710351
Status: Connected
Number of entries: 1
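
For a quick overview of output like the above, the per-brick counts can be pulled out with a short awk sketch. The function name heal_entry_counts is hypothetical, and it assumes the "Brick ..." and "Number of entries: N" lines look exactly as shown in this mail. Note that the total counts an entry once per brick that lists it, so it is not the number of distinct entries.

```shell
# Hypothetical helper: summarise "gluster volume heal <VOLNAME> info" output
# read from stdin as "<brick> <count>" lines plus a final total.
heal_entry_counts() {
    awk '
        /^Brick /              { brick = $2 }
        /^Number of entries: / { print brick, $4; total += $4 }
        END                    { print "total", total + 0 }
    '
}

# Possible use (not executed here):
#   gluster volume heal workdata info | heal_entry_counts
```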

There's a mismatch in one directory; I tried to follow these instructions:
https://docs.gluster.org/en/latest/Troubleshooting/resolving-splitbrain/

gluster volume heal workdata split-brain source-brick
gluster1:/gluster/md4/workdata
gfid:fe7fdbe8-9a39-4793-8d38-6dfdd3d5089b
Healing gfid:fe7fdbe8-9a39-4793-8d38-6dfdd3d5089b failed: File not in
split-brain.
Volume heal failed.

        
Is there any other documentation for gfid mismatch and how to resolve this?


Thx,
Hubert
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users