Geo Replication OSError: [Errno 107] Transport endpoint is not connected

I have a big problem.

When I start geo-replication, everything seems fine, but after replicating 2.5 TB I get errors; it starts over and over again with the same errors.

I have two nodes with a replicated volume and a third arbiter node.
The destination is a single node.
The firewall between all nodes is open.
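
To watch the session while it cycles, I check the worker status like this (a minimal sketch in Python around the standard gluster CLI; the slave host and volume names are placeholders, not my real ones):

    import subprocess

    # Placeholders -- substitute the real slave host and slave volume.
    MASTER_VOL = "datacloud"
    SLAVE = "slavehost::slavevol"

    # "gluster volume geo-replication <MASTER> <SLAVE> status" prints one
    # row per brick with its worker state; this is where I see the workers
    # flip to Faulty after the crash below.
    result = subprocess.run(
        ["gluster", "volume", "geo-replication", MASTER_VOL, SLAVE, "status"],
        capture_output=True, text=True, check=True,
    )
    print(result.stdout)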

Master Log

[2018-10-25 07:08:59.619699] D [master(/gluster/owncloud/brick2):1665:Xcrawl] _GMaster: entering ./data/fa/files/backup/research/projects/2011-Regularity/2012-03-Gain-of-Regularity-linearWFP
[2018-10-25 07:08:59.619874] E [syncdutils(/gluster/owncloud/brick2):325:log_raise_exception] <top>: glusterfs session went down        error=ENOTCONN
[2018-10-25 07:08:59.620109] E [syncdutils(/gluster/owncloud/brick2):331:log_raise_exception] <top>: FULL EXCEPTION TRACE:
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 210, in main
    main_i()
  File "/usr/libexec/glusterfs/python/syncdaemon/gsyncd.py", line 801, in main_i
    local.service_loop(*[r for r in [remote] if r])
  File "/usr/libexec/glusterfs/python/syncdaemon/resource.py", line 1679, in service_loop
    g1.crawlwrap(oneshot=True, register_time=register_time)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 597, in crawlwrap
    self.crawl()
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1555, in crawl
    self.process([item[1]], 0)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1204, in process
    self.process_change(change, done, retry)
  File "/usr/libexec/glusterfs/python/syncdaemon/master.py", line 1143, in process_change
    st = lstat(go[0])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 553, in lstat
    return errno_wrap(os.lstat, [e], [ENOENT], [ESTALE, EBUSY])
  File "/usr/libexec/glusterfs/python/syncdaemon/syncdutils.py", line 535, in errno_wrap
    return call(*arg)
OSError: [Errno 107] Transport endpoint is not connected: '.gfid/5c143d64-165f-44b1-98ed-71e491376a76'
[2018-10-25 07:08:59.627846] D [master(/gluster/owncloud/brick2):1665:Xcrawl] _GMaster: entering ./data/fa/files/backup/research/projects/2011-Regularity/resources
[2018-10-25 07:08:59.632826] D [master(/gluster/owncloud/brick2):1665:Xcrawl] _GMaster: entering ./data/fa/files/backup/research/projects/2011-Regularity/add material
[2018-10-25 07:08:59.633582] D [master(/gluster/owncloud/brick2):1665:Xcrawl] _GMaster: entering ./data/fa/files/backup/research/projects/2011-Regularity/add material/Maple
[2018-10-25 07:08:59.636306] D [master(/gluster/owncloud/brick2):1665:Xcrawl] _GMaster: entering ./data/fa/files/backup/research/projects/2011-Regularity/add material/notes
[2018-10-25 07:08:59.637303] I [syncdutils(/gluster/owncloud/brick2):271:finalize] <top>: exiting.
[2018-10-25 07:08:59.640778] I [repce(/gluster/owncloud/brick2):92:service_loop] RepceServer: terminating on reaching EOF.
[2018-10-25 07:08:59.641222] I [syncdutils(/gluster/owncloud/brick2):271:finalize] <top>: exiting.
[2018-10-25 07:09:00.314140] I [monitor(monitor):363:monitor] Monitor: worker died in startup phase        brick=/gluster/owncloud/brick2
[2018-10-25 07:09:00.315172] I [gsyncdstatus(monitor):243:set_worker_status] GeorepStatus: Worker Status Change        status=Faulty
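
From the traceback, the lstat is wrapped by syncdutils.errno_wrap with ENOENT tolerated and ESTALE/EBUSY retried; ENOTCONN (errno 107) is in neither list, so the exception propagates and the worker goes Faulty. A rough paraphrase of that wrapper logic (my own sketch, not the actual gsyncd source):

    import os
    import time
    from errno import ENOENT, ESTALE, EBUSY

    def errno_wrap(call, args=(), tolerated=(), retried=()):
        # Swallow tolerated errors, retry transient ones a few times,
        # and re-raise everything else.
        for _ in range(5):
            try:
                return call(*args)
            except OSError as exc:
                if exc.errno in tolerated:
                    return None      # e.g. ENOENT: entry vanished, skip it
                if exc.errno in retried:
                    time.sleep(1)    # e.g. ESTALE/EBUSY: try again shortly
                    continue
                raise                # e.g. ENOTCONN: nothing catches this

    # The call from the traceback:
    # errno_wrap(os.lstat, ['.gfid/<gfid>'], [ENOENT], [ESTALE, EBUSY])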

Slave Log

[2018-10-25 07:08:44.206372] I [resource(slave):1502:connect] GLUSTER: Mounting gluster volume locally...
[2018-10-25 07:08:45.229620] I [resource(slave):1515:connect] GLUSTER: Mounted gluster volume   duration=1.0229
[2018-10-25 07:08:45.230180] I [resource(slave):1012:service_loop] GLUSTER: slave listening
[2018-10-25 07:08:59.641242] I [repce(slave):92:service_loop] RepceServer: terminating on reaching EOF.
[2018-10-25 07:08:59.655611] I [syncdutils(slave):271:finalize] <top>: exiting.
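
The slave side mounts fine and only exits because the master worker hangs up, so the disconnect appears to happen on the master's aux mount. To confirm that, I poll the mount with the same lstat call (a minimal sketch; the mount path is a placeholder for the real gsyncd aux mount):

    import errno
    import os
    import time

    MOUNT = "/tmp/gsyncd-aux-mount"  # placeholder: the real aux mount path

    # Poll once per second; when the client loses its bricks, os.lstat
    # raises OSError with errno 107 (ENOTCONN), matching the master log.
    while True:
        try:
            os.lstat(MOUNT)
        except OSError as exc:
            if exc.errno != errno.ENOTCONN:
                raise
            print("mount lost: Transport endpoint is not connected")
            break
        time.sleep(1)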

Volume Info

Volume Name: datacloud
Type: Replicate
Volume ID: 6cc79599-7a5c-4b02-bd86-13020a9d91db
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: 172.17.45.11:/gluster/datacloud/brick2
Brick2: 172.17.45.12:/gluster/datacloud/brick2
Brick3: 172.17.45.13:/gluster/datacloud/brick2 (arbiter)
Options Reconfigured:
cluster.server-quorum-type: server
cluster.shd-max-threads: 32
cluster.self-heal-readdir-size: 64KB
cluster.quorum-type: fixed
transport.address-family: inet
diagnostics.brick-log-level: INFO
changelog.capture-del-path: on
storage.build-pgfid: on
changelog.changelog: on
geo-replication.ignore-pid-check: on
server.statedump-path: /tmp/gluster
cluster.self-heal-window-size: 32
geo-replication.indexing: on
nfs.trusted-sync: off
diagnostics.dump-fd-stats: off
nfs.disable: on
cluster.self-heal-daemon: enable
cluster.background-self-heal-count: 16
cluster.heal-timeout: 120
cluster.data-self-heal-algorithm: full
cluster.consistent-metadata: on
network.ping-timeout: 20
cluster.granular-entry-heal: enable
cluster.server-quorum-ratio: 51%
cluster.enable-shared-storage: enable

Best regards,
Michael


--
Michael Roth  | michael.roth@xxxxxxxxxxxx
IT Solutions - Application Management
Technische Universität Wien - Operngasse 11, 1040 Wien
T +43-1-58801-42091

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users



