Re: Geo-replication: Entry not present on master. Fixing gfid mismatch in slave

Hey David,

For me, a gfid mismatch means that the file was replaced/recreated - just like vim does on Linux (and that is expected for a config file).

Have you checked the gfid of the file on both source and destination? Do they really match, or are they different?
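A quick way to compare them is to read the trusted.gfid extended attribute directly from the brick copy on a master node and on the slave. This is only a sketch - the sub-path and the slave brick path below are placeholders you will need to adjust to your own layout (only the master brick path /nodirectwritedata/gluster/gvol0 is taken from your logs):

```shell
# Sketch only: sub-paths and the slave brick path are placeholders.
MASTER_FILE=/nodirectwritedata/gluster/gvol0/path/to/polycom_00a859f7xxxx.cfg
SLAVE_FILE=/slave/brick/path/to/polycom_00a859f7xxxx.cfg  # hypothetical slave brick path

# trusted.gfid is stored as an xattr on each brick copy of the file;
# extract just the hex value so the two sides are easy to compare.
master_gfid=$(getfattr -n trusted.gfid -e hex "$MASTER_FILE" 2>/dev/null \
              | awk -F= '/^trusted.gfid/{print $2}')
slave_gfid=$(getfattr -n trusted.gfid -e hex "$SLAVE_FILE" 2>/dev/null \
             | awk -F= '/^trusted.gfid/{print $2}')

if [ "$master_gfid" = "$slave_gfid" ]; then
    echo "gfids match: $master_gfid"
else
    echo "gfid mismatch: master=$master_gfid slave=$slave_gfid"
fi
```

Run the getfattr part on each node separately if you prefer; the point is just to compare the two hex values.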

What happens when you move the file away from the slave? Does that fix the issue?
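By "move away" I mean moving the conflicting file out of the slave volume (through a mount of the slave volume, not directly on the brick) so geo-replication can recreate it with the master's gfid. A rough sketch, where the mount point, sub-path, and backup directory are all placeholders:

```shell
# Sketch only: mount point, sub-path, and backup dir are placeholders.
SLAVE_MOUNT=/mnt/gvol0-slave     # a FUSE mount of the slave volume
BACKUP_DIR=/root/georep-backup   # somewhere outside the volume

mkdir -p "$BACKUP_DIR"
# Move the conflicting file out of the slave volume; geo-replication
# should then recreate it on the next sync pass.
mv "$SLAVE_MOUNT/path/to/polycom_00a859f7xxxx.cfg" "$BACKUP_DIR/"
```

If the errors stop after that, it confirms the slave copy had a stale gfid.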

Best Regards,
Strahil Nikolov

On 30 May 2020 at 1:10:56 GMT+03:00, David Cunningham <dcunningham@xxxxxxxxxxxxx> wrote:
>Hello,
>
>We're having an issue with a geo-replication process with unusually
>high
>CPU use and giving "Entry not present on master. Fixing gfid mismatch
>in
>slave" errors. Can anyone help on this?
>
>We have 3 GlusterFS replica nodes (we'll call the master), which also
>push
>data to a remote server (slave) using geo-replication. This has been
>running fine for a couple of months, but yesterday one of the master
>nodes
>started having unusually high CPU use. It's this process:
>
>root@cafs30:/var/log/glusterfs# ps aux | grep 32048
>root     32048 68.7  0.6 1843140 845756 ?      Rl   02:51 493:51
>python2
>/usr/lib/x86_64-linux-gnu/glusterfs/python/syncdaemon/gsyncd.py worker
>gvol0 nvfs10::gvol0 --feedback-fd 15 --local-path
>/nodirectwritedata/gluster/gvol0 --local-node cafs30 --local-node-id
>b7521445-ee93-4fed-8ced-6a609fa8c7d4 --slave-id
>cdcdb210-839c-4306-a4dc-e696b165ed17 --rpc-fd 12,11,9,13 --subvol-num 1
>--resource-remote nvfs30 --resource-remote-id
>1e698ccd-aeec-4ec4-96fe-383da8fc3b78
>
>Here's what is being logged in
>/var/log/glusterfs/geo-replication/gvol0_nvfs10_gvol0/gsyncd.log:
>
>[2020-05-29 21:57:18.843524] I [master(worker
>/nodirectwritedata/gluster/gvol0):1470:crawl] _GMaster: slave's time
> stime=(1590789408, 0)
>[2020-05-29 21:57:30.626172] I [master(worker
>/nodirectwritedata/gluster/gvol0):813:fix_possible_entry_failures]
>_GMaster: Entry not present on master. Fixing gfid mismatch in slave.
>Deleting the entry    retry_count=1   entry=({u'uid': 108, u'gfid':
>u'7c0b75e5-d8b7-454f-8010-112d613c599e', u'gid': 117, u'mode': 33204,
>u'entry':
>u'.gfid/c5422396-1578-4b50-a29d-315be2a9c5d8/00a859f7xxxx.cfg',
>u'op': u'CREATE'}, 17, {u'slave_isdir': False, u'gfid_mismatch': True,
>u'slave_name': None, u'slave_gfid':
>u'ec4b0ace-2ec4-4ea5-adbc-9f519b81917c', u'name_mismatch': False,
>u'dst':
>False})
>[2020-05-29 21:57:30.627893] I [master(worker
>/nodirectwritedata/gluster/gvol0):813:fix_possible_entry_failures]
>_GMaster: Entry not present on master. Fixing gfid mismatch in slave.
>Deleting the entry    retry_count=1   entry=({u'uid': 108, u'gfid':
>u'a4d52e40-2e2f-4885-be5f-65fe95a8ebd7', u'gid': 117, u'mode': 33204,
>u'entry':
>u'.gfid/f857c42e-22f1-4ce4-8f2e-13bdadedde45/polycom_00a859f7xxxx.cfg',
>u'op': u'CREATE'}, 17, {u'slave_isdir': False, u'gfid_mismatch': True,
>u'slave_name': None, u'slave_gfid':
>u'ece8da77-b5ea-45a7-9af7-7d4d8f55f74a', u'name_mismatch': False,
>u'dst':
>False})
>[2020-05-29 21:57:30.629532] I [master(worker
>/nodirectwritedata/gluster/gvol0):813:fix_possible_entry_failures]
>_GMaster: Entry not present on master. Fixing gfid mismatch in slave.
>Deleting the entry    retry_count=1   entry=({u'uid': 108, u'gfid':
>u'3c525ad8-aeb2-46b6-9c41-7fb4987916f8', u'gid': 117, u'mode': 33204,
>u'entry':
>u'.gfid/f857c42e-22f1-4ce4-8f2e-13bdadedde45/00a859f7xxxx-directory.xml',
>u'op': u'CREATE'}, 17, {u'slave_isdir': False, u'gfid_mismatch': True,
>u'slave_name': None, u'slave_gfid':
>u'06717b5a-d842-495d-bd25-aab9cd454490', u'name_mismatch': False,
>u'dst':
>False})
>[2020-05-29 21:57:30.659123] I [master(worker
>/nodirectwritedata/gluster/gvol0):942:handle_entry_failures] _GMaster:
>Sucessfully fixed entry ops with gfid mismatch     retry_count=1
>[2020-05-29 21:57:30.659343] I [master(worker
>/nodirectwritedata/gluster/gvol0):1194:process_change] _GMaster: Retry
>original entries. count = 1
>[2020-05-29 21:57:30.725810] I [master(worker
>/nodirectwritedata/gluster/gvol0):1197:process_change] _GMaster:
>Sucessfully fixed all entry ops with gfid mismatch
>[2020-05-29 21:57:31.747319] I [master(worker
>/nodirectwritedata/gluster/gvol0):1954:syncjob] Syncer: Sync Time Taken
>duration=0.7409 num_files=18    job=1   return_code=0
>
>We've verified that the files like polycom_00a859f7xxxx.cfg referred to
>in
>the error do exist on the master nodes and slave.
>
>We found this bug fix:
>https://bugzilla.redhat.com/show_bug.cgi?id=1642865
>
>However that fix went in 5.1, and we're running 5.12 on the master
>nodes
>and slave. A couple of GlusterFS clients connected to the master nodes
>are
>running 5.13.
>
>Would anyone have any suggestions? Thank you in advance.
________



Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users



