Geo Replication Stale File Handle with Reached Maximum Retries

Lazuardi Nasution <mrxlazuardin@xxxxxxxxx> · Fri, 23 Nov 2018 05:28:25 +0700

Hi,
I'm using Gluster 4.1.5 on CentOS 7.5 for geo-replication. Sometimes replication for some bricks goes Faulty, with the following error on the master.

[2018-11-22 21:25:41.228754] E [repce(worker /mnt/BRICK7):197:__call__] RepceClient: call failed        call=32010:140439004231488:1542921938.88        method=entry_ops     error=OSError
[2018-11-22 21:25:41.229327] E [syncdutils(worker /mnt/BRICK7):332:log_raise_exception] <top>: FAIL:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 311, in main
    func(args)
  File "/usr/libexec/glusterfs/python/syncdaemon/subcmds.py", line 72, in subcmd_worker
    local.service_loop(remote)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1295, in service_loop
    g3.crawlwrap(_oneshot_=True)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 615, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1545, in crawl
    self.changelogs_batch_process(changes)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1445, in changelogs_batch_process
    self.process(batch)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1280, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1179, in process_change
    failures = self.slave.server.entry_ops(entries)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 216, in __call__
    return self.ins(self.meth, *a)
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 198, in __call__
    raise res
OSError: [Errno 116] Stale file handle

On the slave I find the following error as well.

[2018-11-22 21:25:41.217788] E [repce(slave gluster-eadmin-data.vm/mnt/BRICK7):105:worker] <top>: call failed:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 101, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 675, in entry_ops
    uid, gid)
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 526, in rename_with_disk_gfid_confirmation
    [ENOENT, EEXIST], [ESTALE, EBUSY])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 540, in errno_wrap
    return call(*arg)
OSError: [Errno 116] Stale file handle

I think these errors are related to the following warning on the slave.

[2018-11-22 21:25:41.217561] W [syncdutils(slave gluster-eadmin-data.vm/mnt/BRICK7):552:errno_wrap] <top>: reached maximum retries      args=['.gfid/86ba8c38-5ab0-417e-9130-64dd2d7cf4aa/glue_app_debug_log.log.79', '.gfid/86ba8c38-5ab0-417e-9130-64dd2d7cf4aa/glue_app_debug_log.log.80']        error=[Errno 116] Stale file handle
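
For reference, the retry behaviour behind this warning can be sketched roughly as below. This is a simplified illustration, not the actual gsyncd source: the name retry_on_errno, the retry limit and the delay are all assumptions. Going by the slave traceback, the real errno_wrap appears to be passed one list of errnos to tolerate ([ENOENT, EEXIST]) and one list to retry ([ESTALE, EBUSY]), and the warning is logged once the retries run out.

```python
import errno
import time

MAX_TRIES = 10      # assumed retry limit; the real value in gsyncd may differ
RETRY_DELAY = 0.25  # assumed pause between attempts, in seconds


def retry_on_errno(call, args=(), tolerated=(), retried=(),
                   max_tries=MAX_TRIES, delay=RETRY_DELAY):
    """Run call(*args), tolerating some errnos and retrying others.

    - errno in `tolerated`: returned to the caller instead of raised
    - errno in `retried`:   the call is repeated, up to max_tries attempts
    - any other OSError:    raised immediately
    """
    tries = 0
    while True:
        try:
            return call(*args)
        except OSError as ex:
            if ex.errno in tolerated:
                return ex.errno
            if ex.errno not in retried:
                raise
            tries += 1
            if tries >= max_tries:
                # Corresponds to "reached maximum retries": the error is
                # persistent, so give up and let it propagate.
                raise
            time.sleep(delay)


def always_stale(src, dst):
    """Hypothetical stand-in for a rename that keeps failing with ESTALE."""
    raise OSError(errno.ESTALE, "Stale file handle")


try:
    retry_on_errno(always_stale, args=("log.79", "log.80"),
                   tolerated=(errno.ENOENT, errno.EEXIST),
                   retried=(errno.ESTALE, errno.EBUSY),
                   max_tries=3, delay=0)
except OSError as ex:
    print("gave up:", ex)
```

Under this reading, an ESTALE that never clears is not transient: every attempt fails the same way, so the wrapper eventually re-raises it and the worker goes Faulty.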

These errors go away if I move the named files (glue_app_debug_log.log.79 and glue_app_debug_log.log.80 in this case) from the Gluster mount to a temporary location and then move them back to their original location on the Gluster mount. How can I solve this?
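
The workaround described above can be scripted along these lines. It is only a sketch of the manual steps, not a fix for the underlying problem; the helper name move_out_and_back and the example paths are hypothetical, and the staging directory is assumed to be outside the Gluster volume.

```python
import os
import shutil
import tempfile


def move_out_and_back(path_on_mount, staging_dir=None):
    """Move a file off the Gluster mount and back to its original path."""
    created = staging_dir is None
    if created:
        # Assumption: the default temp directory is not on the Gluster volume.
        staging_dir = tempfile.mkdtemp(prefix="georep-staging-")
    staged = os.path.join(staging_dir, os.path.basename(path_on_mount))
    shutil.move(path_on_mount, staged)   # step 1: off the mount
    shutil.move(staged, path_on_mount)   # step 2: back under the original name
    if created:
        os.rmdir(staging_dir)            # staging directory is empty again
    return path_on_mount


if __name__ == "__main__":
    # Hypothetical mount point; substitute the real location of the files
    # named in the "reached maximum retries" warning.
    mount_dir = "/mnt/glustervol/logs"
    for name in ("glue_app_debug_log.log.79", "glue_app_debug_log.log.80"):
        target = os.path.join(mount_dir, name)
        if os.path.exists(target):
            move_out_and_back(target)
```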

Best regards,