Re: Brick missing trusted.glusterfs.dht xattr

Hi Sunny,

Yes, I have attached the gsyncd.log file. I couldn't find any
changes-<brick-path>.log files...

When I try to start replication, the session goes Faulty right away:

[root@gluster01 ~]# rpm -q glusterfs
glusterfs-5.6-1.el7.x86_64
[root@gluster01 ~]# uname -r
3.10.0-957.21.3.el7.x86_64
[root@gluster01 ~]# cat /etc/centos-release
CentOS Linux release 7.6.1810 (Core)

[root@gluster01 ~]# gluster volume geo-replication storage root@10.0.231.81::pcic-backup start
Starting geo-replication session between storage & 10.0.231.81::pcic-backup has been successful
[root@gluster01 ~]# gluster volume geo-replication storage root@10.0.231.81::pcic-backup status

MASTER NODE    MASTER VOL    MASTER BRICK                  SLAVE USER    SLAVE                       SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
-------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.50    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.52    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.54    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.51    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.53    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.55    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.56    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
[root@gluster01 ~]# gluster volume geo-replication storage root@10.0.231.81::pcic-backup stop
Stopping geo-replication session between storage & 10.0.231.81::pcic-backup has been successful

This is the primary cluster:

[root@gluster01 ~]# gluster volume info storage
 
Volume Name: storage
Type: Distribute
Volume ID: 6f95525a-94d7-4174-bac4-e1a18fe010a2
Status: Started
Snapshot Count: 0
Number of Bricks: 7
Transport-type: tcp
Bricks:
Brick1: 10.0.231.50:/mnt/raid6-storage/storage
Brick2: 10.0.231.51:/mnt/raid6-storage/storage
Brick3: 10.0.231.52:/mnt/raid6-storage/storage
Brick4: 10.0.231.53:/mnt/raid6-storage/storage
Brick5: 10.0.231.54:/mnt/raid6-storage/storage
Brick6: 10.0.231.55:/mnt/raid6-storage/storage
Brick7: 10.0.231.56:/mnt/raid6-storage/storage
Options Reconfigured:
features.read-only: off
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
nfs.disable: on
geo-replication.indexing: on
geo-replication.ignore-pid-check: on
transport.address-family: inet
features.quota-deem-statfs: on
changelog.changelog: on
diagnostics.client-log-level: INFO


And this is the cluster I'm trying to replicate to:

[root@pcic-backup01 ~]# gluster volume info pcic-backup
 
Volume Name: pcic-backup
Type: Distribute
Volume ID: 2890bcde-a023-4feb-a0e5-e8ef8f337d4c
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.0.231.81:/pcic-backup01-zpool/brick
Brick2: 10.0.231.82:/pcic-backup02-zpool/brick
Options Reconfigured:
nfs.disable: on
transport.address-family: inet


Thanks,
 -Matthew

On 7/28/19 10:56 PM, Sunny Kumar wrote:
> HI Matthew,
>
> Can you share geo-rep logs and one more log file
> (changes-<brick-path>.log) it will help to pinpoint actual reason
> behind failure.
>
> /sunny
>
> On Mon, Jul 29, 2019 at 9:13 AM Nithya Balachandran <nbalacha@xxxxxxxxxx> wrote:
>>
>>
>> On Sat, 27 Jul 2019 at 02:31, Matthew Benstead <matthewb@xxxxxxx> wrote:
>>> Ok, thank you for explaining everything - that makes sense.
>>>
>>> Currently the brick file systems are pretty evenly distributed so I probably won't run the fix-layout right now.
>>>
>>> Would this state have any impact on geo-replication? I'm trying to geo-replicate this volume, but am getting a weird error: "Changelog register failed error=[Errno 21] Is a directory"
>>
>> It should not. Sunny, can you comment on this?
>>
>> Regards,
>> Nithya
>>>
>>> I assume this is related to something else, but I wasn't sure.
>>>
>>> Thanks,
>>>  -Matthew
>>>
>>> --
>>> Matthew Benstead
>>> System Administrator
>>> Pacific Climate Impacts Consortium
>>> University of Victoria, UH1
>>> PO Box 1800, STN CSC
>>> Victoria, BC, V8W 2Y2
>>> Phone: +1-250-721-8432
>>> Email: matthewb@xxxxxxx
>>>
>>> On 7/26/19 12:02 AM, Nithya Balachandran wrote:
>>>
>>>
>>>
>>> On Fri, 26 Jul 2019 at 01:56, Matthew Benstead <matthewb@xxxxxxx> wrote:
>>>> Hi Nithya,
>>>>
>>>> Hmm... I don't remember if I did, but based on what I'm seeing it sounds like I probably didn't run rebalance or fix-layout.
>>>>
>>>> It looks like folders that haven't had any new files created have a dht of 0, while other folders have non-zero values.
>>>>
>>>> [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage/ | grep dht
>>>> [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage/home | grep dht
>>>> trusted.glusterfs.dht=0x00000000000000000000000000000000
>>>> [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage/home/matthewb | grep dht
>>>> trusted.glusterfs.dht=0x00000001000000004924921a6db6dbc7
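>>>>
>>>> (As a rough sketch, assuming from the values above that the xattr is an 8-byte header followed by a 4-byte range start and a 4-byte range end, the range can be pulled out like this:)
>>>>
>>>> x=$(getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage/home/matthewb 2>/dev/null \
>>>>     | awk -F= '/^trusted.glusterfs.dht=/ {print $2}')
>>>> hex=${x#0x}
>>>> echo "start=0x${hex:16:8} end=0x${hex:24:8}"   # prints start=0x4924921a end=0x6db6dbc7 here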
>>>>
>>>> If I just run the fix-layout command will it re-create all of the dht values or just the missing ones?
>>>
>>> A fix-layout will recalculate the layouts entirely, so all the values will change. No files will be moved.
>>> A rebalance will recalculate the layouts like the fix-layout but will also move files to their new locations based on the new layout ranges. This could take a lot of time depending on the number of files/directories on the volume. If you do this, I would recommend that you turn off lookup-optimize until the rebalance is over.
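>>>
>>> As a rough outline (assuming the volume name "storage"), that sequence would look something like:
>>>
>>> gluster volume set storage cluster.lookup-optimize off
>>> gluster volume rebalance storage start
>>> gluster volume rebalance storage status        # wait for "completed" on all nodes
>>> gluster volume set storage cluster.lookup-optimize on     # re-enable afterwards, if it was enabled before
>>>
>>> A fix-layout alone would instead be "gluster volume rebalance storage fix-layout start".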
>>>
>>>> Since the brick is already fairly size balanced could I get away with running fix-layout but not rebalance? Or would the new dht layout mean slower accesses since the files may be expected on different bricks?
>>>
>>> The first access for a file will be slower. The next one will be faster as the location will be cached in the client's in-memory structures.
>>> You may not need to run either a fix-layout or a rebalance if new file creations will be in directories created after the add-brick. Gluster will automatically include all 7 bricks for those directories.
>>>
>>> Regards,
>>> Nithya
>>>
>>>> Thanks,
>>>>  -Matthew
>>>>
>>>> --
>>>> Matthew Benstead
>>>> System Administrator
>>>> Pacific Climate Impacts Consortium
>>>> University of Victoria, UH1
>>>> PO Box 1800, STN CSC
>>>> Victoria, BC, V8W 2Y2
>>>> Phone: +1-250-721-8432
>>>> Email: matthewb@xxxxxxx
>>>>
>>>> On 7/24/19 9:30 PM, Nithya Balachandran wrote:
>>>>
>>>>
>>>>
>>>> On Wed, 24 Jul 2019 at 22:12, Matthew Benstead <matthewb@xxxxxxx> wrote:
>>>>> So looking more closely at the trusted.glusterfs.dht attributes from the bricks it looks like they cover the entire range... and there is no range left for gluster07.
>>>>>
>>>>> The first 6 bricks range from 0x00000000 to 0xffffffff - so... is there a way to re-calculate what the dht values should be? Each of the bricks should have its own range:
>>>>>
>>>>> Gluster05 00000000 -> 2aaaaaa9
>>>>> Gluster06 2aaaaaaa -> 55555553
>>>>> Gluster01 55555554 -> 7ffffffd
>>>>> Gluster02 7ffffffe -> aaaaaaa7
>>>>> Gluster03 aaaaaaa8 -> d5555551
>>>>> Gluster04 d5555552 -> ffffffff
>>>>> Gluster07 None
>>>>>
>>>>> If we split the range evenly across 7 servers, each server would get a range of about 0x24924924.
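>>>>>
>>>>> (For what it's worth, a quick shell sketch of that even split:)
>>>>>
>>>>> n=7
>>>>> step=$(( 0x100000000 / n ))        # ~0x24924924 per brick
>>>>> for i in $(seq 0 $(( n - 1 ))); do
>>>>>   start=$(( i * step ))
>>>>>   end=$(( i == n - 1 ? 0xffffffff : (i + 1) * step - 1 ))
>>>>>   printf 'brick %d: %08x -> %08x\n' $(( i + 1 )) "$start" "$end"
>>>>> done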
>>>>>
>>>>> Now in terms of the gluster07 brick, about 2 years ago the RAID array the brick was stored on became corrupted. I ran the remove-brick force command, then provisioned a new server, ran the add-brick command and then restored the missing files from backup by copying them back to the main gluster mount (not the brick).
>>>>>
>>>> Did you run a rebalance after performing the add-brick? Without a rebalance/fix-layout, the layout for existing directories on the volume will not be updated to use the new brick as well.
>>>>
>>>> That the layout does not include the new brick in the root dir is not in itself a problem. Do you create a lot of files directly in the root of the volume? If yes, you might want to run a rebalance. Otherwise, if you mostly create files in newly added directories, you can probably ignore this. You can check the layout for directories on the volume and see if they incorporate brick7.
>>>>
>>>> I would expect a lookup on the root to have set an xattr on the brick with an empty layout range. The fact that the xattr does not exist at all on the brick is what I am looking into.
>>>>
>>>>
>>>>> It looks like prior to that event this was the layout - which would make sense given the equal size of the 7 bricks:
>>>>>
>>>>> gluster02.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>>> # file: /mnt/raid6-storage/storage
>>>>> trusted.glusterfs.dht=0x000000010000000048bfff206d1ffe5f
>>>>>
>>>>> gluster05.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>>> # file: /mnt/raid6-storage/storage
>>>>> trusted.glusterfs.dht=0x0000000100000000b5dffce0da3ffc1f
>>>>>
>>>>> gluster04.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>>> # file: /mnt/raid6-storage/storage
>>>>> trusted.glusterfs.dht=0x0000000100000000917ffda0b5dffcdf
>>>>>
>>>>> gluster03.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>>> # file: /mnt/raid6-storage/storage
>>>>> trusted.glusterfs.dht=0x00000001000000006d1ffe60917ffd9f
>>>>>
>>>>> gluster01.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>>> # file: /mnt/raid6-storage/storage
>>>>> trusted.glusterfs.dht=0x0000000100000000245fffe048bfff1f
>>>>>
>>>>> gluster07.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>>> # file: /mnt/raid6-storage/storage
>>>>> trusted.glusterfs.dht=0x000000010000000000000000245fffdf
>>>>>
>>>>> gluster06.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>>> # file: /mnt/raid6-storage/storage
>>>>> trusted.glusterfs.dht=0x0000000100000000da3ffc20ffffffff
>>>>>
>>>>> Which yields the following:
>>>>>
>>>>> 00000000 -> 245fffdf    Gluster07
>>>>> 245fffe0 -> 48bfff1f    Gluster01
>>>>> 48bfff20 -> 6d1ffe5f    Gluster02
>>>>> 6d1ffe60 -> 917ffd9f    Gluster03
>>>>> 917ffda0 -> b5dffcdf    Gluster04
>>>>> b5dffce0 -> da3ffc1f    Gluster05
>>>>> da3ffc20 -> ffffffff    Gluster06
>>>>>
>>>>> Is there some way to get back to this?
>>>>>
>>>>> Thanks,
>>>>>  -Matthew
>>>>>
>>>>> --
>>>>> Matthew Benstead
>>>>> System Administrator
>>>>> Pacific Climate Impacts Consortium
>>>>> University of Victoria, UH1
>>>>> PO Box 1800, STN CSC
>>>>> Victoria, BC, V8W 2Y2
>>>>> Phone: +1-250-721-8432
>>>>> Email: matthewb@xxxxxxx
>>>>>
>>>>> On 7/18/19 7:20 AM, Matthew Benstead wrote:
>>>>>
>>>>> Hi Nithya,
>>>>>
>>>>> No - it was added about a year and a half ago. I have tried re-mounting the volume on the server, but it didn't add the attr:
>>>>>
>>>>> [root@gluster07 ~]# umount /storage/
>>>>> [root@gluster07 ~]# cat /etc/fstab | grep "/storage"
>>>>> 10.0.231.56:/storage /storage glusterfs defaults,log-level=WARNING,backupvolfile-server=10.0.231.51 0 0
>>>>> [root@gluster07 ~]# mount /storage/
>>>>> [root@gluster07 ~]# df -h /storage/
>>>>> Filesystem            Size  Used Avail Use% Mounted on
>>>>> 10.0.231.56:/storage  255T  194T   62T  77% /storage
>>>>> [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage/
>>>>> # file: /mnt/raid6-storage/storage/
>>>>> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>>>>> trusted.gfid=0x00000000000000000000000000000001
>>>>> trusted.glusterfs.6f95525a-94d7-4174-bac4-e1a18fe010a2.xtime=0x5d307baa00023ec0
>>>>> trusted.glusterfs.quota.dirty=0x3000
>>>>> trusted.glusterfs.quota.size.2=0x00001b71d5279e000000000000763e32000000000005cd53
>>>>> trusted.glusterfs.volume-id=0x6f95525a94d74174bac4e1a18fe010a2
>>>>>
>>>>> Thanks,
>>>>>  -Matthew
>>>>>
>>>>> On 7/17/19 10:04 PM, Nithya Balachandran wrote:
>>>>>
>>>>> Hi Matthew,
>>>>>
>>>>> Was this node/brick added to the volume recently? If yes, try mounting the volume on a fresh mount point - that should create the xattr on this as well.
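>>>>>
>>>>> Something along these lines (the temporary mount point is just an example):
>>>>>
>>>>> mkdir -p /mnt/storage-check
>>>>> mount -t glusterfs 10.0.231.56:/storage /mnt/storage-check
>>>>> ls /mnt/storage-check > /dev/null      # the lookup on the root should set the layout xattr
>>>>> getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage
>>>>> umount /mnt/storage-check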
>>>>>
>>>>> Regards,
>>>>> Nithya
>>>>>
>>>>> On Wed, 17 Jul 2019 at 21:01, Matthew Benstead <matthewb@xxxxxxx> wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I've just noticed one brick in my 7 node distribute volume is missing
>>>>>> the trusted.glusterfs.dht xattr...? How can I fix this?
>>>>>>
>>>>>> I'm running glusterfs-5.3-2.el7.x86_64 on CentOS 7.
>>>>>>
>>>>>> All of the other nodes are fine, but gluster07 from the list below does
>>>>>> not have the attribute.
>>>>>>
>>>>>> $ ansible -i hosts gluster-servers[0:6] ... -m shell -a "getfattr -m .
>>>>>> --absolute-names -n trusted.glusterfs.dht -e hex
>>>>>> /mnt/raid6-storage/storage"
>>>>>> ...
>>>>>> gluster05 | SUCCESS | rc=0 >>
>>>>>> # file: /mnt/raid6-storage/storage
>>>>>> trusted.glusterfs.dht=0x0000000100000000000000002aaaaaa9
>>>>>>
>>>>>> gluster03 | SUCCESS | rc=0 >>
>>>>>> # file: /mnt/raid6-storage/storage
>>>>>> trusted.glusterfs.dht=0x0000000100000000aaaaaaa8d5555551
>>>>>>
>>>>>> gluster04 | SUCCESS | rc=0 >>
>>>>>> # file: /mnt/raid6-storage/storage
>>>>>> trusted.glusterfs.dht=0x0000000100000000d5555552ffffffff
>>>>>>
>>>>>> gluster06 | SUCCESS | rc=0 >>
>>>>>> # file: /mnt/raid6-storage/storage
>>>>>> trusted.glusterfs.dht=0x00000001000000002aaaaaaa55555553
>>>>>>
>>>>>> gluster02 | SUCCESS | rc=0 >>
>>>>>> # file: /mnt/raid6-storage/storage
>>>>>> trusted.glusterfs.dht=0x00000001000000007ffffffeaaaaaaa7
>>>>>>
>>>>>> gluster07 | FAILED | rc=1 >>
>>>>>> /mnt/raid6-storage/storage: trusted.glusterfs.dht: No such attribute
>>>>>> (non-zero return code)
>>>>>>
>>>>>> gluster01 | SUCCESS | rc=0 >>
>>>>>> # file: /mnt/raid6-storage/storage
>>>>>> trusted.glusterfs.dht=0x0000000100000000555555547ffffffd
>>>>>>
>>>>>> Here are all of the attr's from the brick:
>>>>>>
>>>>>> [root@gluster07 ~]# getfattr --absolute-names -m . -d -e hex
>>>>>> /mnt/raid6-storage/storage/
>>>>>> # file: /mnt/raid6-storage/storage/
>>>>>> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>>>>>> trusted.gfid=0x00000000000000000000000000000001
>>>>>> trusted.glusterfs.6f95525a-94d7-4174-bac4-e1a18fe010a2.xtime=0x5d2dee800001fdf9
>>>>>> trusted.glusterfs.quota.dirty=0x3000
>>>>>> trusted.glusterfs.quota.size.2=0x00001b69498a1400000000000076332e000000000005cd03
>>>>>> trusted.glusterfs.volume-id=0x6f95525a94d74174bac4e1a18fe010a2
>>>>>>
>>>>>>
>>>>>> And here is the volume information:
>>>>>>
>>>>>> [root@gluster07 ~]# gluster volume info storage
>>>>>>
>>>>>> Volume Name: storage
>>>>>> Type: Distribute
>>>>>> Volume ID: 6f95525a-94d7-4174-bac4-e1a18fe010a2
>>>>>> Status: Started
>>>>>> Snapshot Count: 0
>>>>>> Number of Bricks: 7
>>>>>> Transport-type: tcp
>>>>>> Bricks:
>>>>>> Brick1: 10.0.231.50:/mnt/raid6-storage/storage
>>>>>> Brick2: 10.0.231.51:/mnt/raid6-storage/storage
>>>>>> Brick3: 10.0.231.52:/mnt/raid6-storage/storage
>>>>>> Brick4: 10.0.231.53:/mnt/raid6-storage/storage
>>>>>> Brick5: 10.0.231.54:/mnt/raid6-storage/storage
>>>>>> Brick6: 10.0.231.55:/mnt/raid6-storage/storage
>>>>>> Brick7: 10.0.231.56:/mnt/raid6-storage/storage
>>>>>> Options Reconfigured:
>>>>>> changelog.changelog: on
>>>>>> features.quota-deem-statfs: on
>>>>>> features.read-only: off
>>>>>> features.inode-quota: on
>>>>>> features.quota: on
>>>>>> performance.readdir-ahead: on
>>>>>> nfs.disable: on
>>>>>> geo-replication.indexing: on
>>>>>> geo-replication.ignore-pid-check: on
>>>>>> transport.address-family: inet
>>>>>>
>>>>>> Thanks,
>>>>>>  -Matthew
>>>>>> _______________________________________________
>>>>>> Gluster-users mailing list
>>>>>> Gluster-users@xxxxxxxxxxx
>>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>>

gsyncd.log:

[2019-07-29 16:19:30.795255] I [gsyncd(config-get):308:main] <top>: Using session config file   path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:19:30.913547] I [gsyncd(status):308:main] <top>: Using session config file       path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:21:43.801510] I [gsyncd(config-get):308:main] <top>: Using session config file   path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:21:43.917597] I [gsyncd(status):308:main] <top>: Using session config file       path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:25:48.446841] I [gsyncd(config-get):308:main] <top>: Using session config file   path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:25:48.621459] I [gsyncd(status):308:main] <top>: Using session config file       path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:27:59.304338] I [gsyncd(config-get):308:main] <top>: Using session config file   path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:27:59.420490] I [gsyncd(config-get):308:main] <top>: Using session config file   path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:27:59.881472] I [gsyncd(config-get):308:main] <top>: Using session config file   path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:28:00.25814] I [gsyncd(monitor):308:main] <top>: Using session config file       path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:28:01.223469] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Initializing...
[2019-07-29 16:28:01.223880] I [monitor(monitor):157:monitor] Monitor: starting gsyncd worker   brick=/mnt/raid6-storage/storage        slave_node=10.0.231.81
[2019-07-29 16:28:01.312726] I [gsyncd(agent /mnt/raid6-storage/storage):308:main] <top>: Using session config file     path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:28:01.314676] I [changelogagent(agent /mnt/raid6-storage/storage):72:__init__] ChangelogAgent: Agent listining...
[2019-07-29 16:28:01.316977] I [gsyncd(worker /mnt/raid6-storage/storage):308:main] <top>: Using session config file    path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:28:01.336112] I [resource(worker /mnt/raid6-storage/storage):1366:connect_remote] SSH: Initializing SSH connection between master and slave...
[2019-07-29 16:28:03.150192] I [resource(worker /mnt/raid6-storage/storage):1413:connect_remote] SSH: SSH connection between master and slave established.      duration=1.8138
[2019-07-29 16:28:03.150571] I [resource(worker /mnt/raid6-storage/storage):1085:connect] GLUSTER: Mounting gluster volume locally...
[2019-07-29 16:28:04.320101] I [resource(worker /mnt/raid6-storage/storage):1108:connect] GLUSTER: Mounted gluster volume       duration=1.1692
[2019-07-29 16:28:04.320516] I [subcmds(worker /mnt/raid6-storage/storage):80:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2019-07-29 16:28:04.342652] E [repce(agent /mnt/raid6-storage/storage):122:worker] <top>: call failed: 
Traceback (most recent call last):
  File "/usr/libexec/glusterfs/python/syncdaemon/repce.py", line 118, in worker
    res = getattr(self.obj, rmeth)(*in_data[2:])
  File "/usr/libexec/glusterfs/python/syncdaemon/changelogagent.py", line 40, in register
    return Changes.cl_register(cl_brick, cl_dir, cl_log, cl_level, retries)
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 45, in cl_register
    cls.raise_changelog_err()
  File "/usr/libexec/glusterfs/python/syncdaemon/libgfchangelog.py", line 29, in raise_changelog_err
    raise ChangelogException(errn, os.strerror(errn))
ChangelogException: [Errno 21] Is a directory
[2019-07-29 16:28:04.344322] E [repce(worker /mnt/raid6-storage/storage):214:__call__] RepceClient: call failed call=7597:139696847746880:1564417684.34 method=register error=ChangelogException
[2019-07-29 16:28:04.344624] E [resource(worker /mnt/raid6-storage/storage):1266:service_loop] GLUSTER: Changelog register failed       error=[Errno 21] Is a directory
[2019-07-29 16:28:04.367947] I [repce(agent /mnt/raid6-storage/storage):97:service_loop] RepceServer: terminating on reaching EOF.
[2019-07-29 16:28:05.322334] I [monitor(monitor):278:monitor] Monitor: worker died in startup phase     brick=/mnt/raid6-storage/storage
[2019-07-29 16:28:05.329349] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Faulty
[2019-07-29 16:28:12.136109] I [gsyncd(config-get):308:main] <top>: Using session config file   path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:28:12.251986] I [gsyncd(status):308:main] <top>: Using session config file       path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:28:15.346446] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Initializing...
[2019-07-29 16:28:15.346721] I [monitor(monitor):157:monitor] Monitor: starting gsyncd worker   brick=/mnt/raid6-storage/storage        slave_node=10.0.231.81
[2019-07-29 16:28:15.439521] I [gsyncd(agent /mnt/raid6-storage/storage):308:main] <top>: Using session config file     path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:28:15.441205] I [changelogagent(agent /mnt/raid6-storage/storage):72:__init__] ChangelogAgent: Agent listining...
[2019-07-29 16:28:15.442174] I [gsyncd(worker /mnt/raid6-storage/storage):308:main] <top>: Using session config file    path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:28:15.461399] I [resource(worker /mnt/raid6-storage/storage):1366:connect_remote] SSH: Initializing SSH connection between master and slave...
[2019-07-29 16:28:17.209469] I [resource(worker /mnt/raid6-storage/storage):1413:connect_remote] SSH: SSH connection between master and slave established.      duration=1.7478
[2019-07-29 16:28:17.209813] I [resource(worker /mnt/raid6-storage/storage):1085:connect] GLUSTER: Mounting gluster volume locally...
[2019-07-29 16:28:17.821201] I [gsyncd(config-get):308:main] <top>: Using session config file   path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:28:17.897001] I [gsyncd(config-get):308:main] <top>: Using session config file   path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:28:18.265443] I [gsyncd(config-get):308:main] <top>: Using session config file   path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:28:18.334799] I [gsyncd(config-get):308:main] <top>: Using session config file   path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:28:18.339283] E [syncdutils(worker /mnt/raid6-storage/storage):311:log_raise_exception] <top>: connection to peer is broken
[2019-07-29 16:28:18.342001] E [syncdutils(worker /mnt/raid6-storage/storage):809:errlog] Popen: command returned error cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-m3oiNg/3e2b94422c2f28333f9437f059a5a934.sock 10.0.231.81 /nonexistent/gsyncd slave storage 10.0.231.81::pcic-backup --master-node 10.0.231.50 --master-node-id a965e782-39e2-41cc-a0d1-b32ecccdcd2f --master-brick /mnt/raid6-storage/storage --local-node 10.0.231.81 --local-node-id 90627fcb-a89a-4f6d-94ba-3b99d73ab6ac --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/sbin    error=255
[2019-07-29 16:28:18.342448] E [syncdutils(worker /mnt/raid6-storage/storage):813:logerr] Popen: ssh> Killed by signal 15.
[2019-07-29 16:28:18.367300] I [repce(agent /mnt/raid6-storage/storage):97:service_loop] RepceServer: terminating on reaching EOF.
[2019-07-29 16:28:18.367609] I [monitor(monitor):268:monitor] Monitor: worker died before establishing connection       brick=/mnt/raid6-storage/storage
[2019-07-29 16:28:18.376351] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change status=Faulty
[2019-07-29 16:28:19.435971] I [gsyncd(monitor-status):308:main] <top>: Using session config file       path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf
[2019-07-29 16:28:19.452789] I [subcmds(monitor-status):29:subcmd_monitor_status] <top>: Monitor Status Change  status=Stopped

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users
