Re: GeoRep Faulty after Gluster 7 to 8 upgrade - gfchangelog: wrong result

@Kaleb Keithley @Kotresh Hiremath Ravishankar any insights on the use of rpm scriptlets for this script? Basically, this script is to be run before the upgrade.

Strahil,
I will update the docs once we decide with respect to rpm scriptlet part.

Regards,
Shwetha  


On Fri, Mar 19, 2021 at 5:45 PM Strahil Nikolov <hunter86_bg@xxxxxxxxx> wrote:
Hi Shwetha,

Is that script mentioned in the documentation (release notes, for example)?

Do you think that the new glusterfs rpm can safely call the script (via rpm scriptlets)?


If yes, then it will be nice if the rpm itself runs the script instead of the user.


Best Regards,
Strahil Nikolov

On Fri, Mar 19, 2021 at 11:44, Shwetha Acharya wrote:
Hi Matthew,

The upgrade script https://github.com/gluster/glusterfs/commit/2857fe3fad4d2b30894847088a54b847b88a23b9 needs to be run to make sure that the changelog and htime files created in the older version are updated to match the new directory structure.

If this is not done, the search algorithm used during the history crawl gives a wrong result.

Regards,
Shwetha

On Fri, Mar 19, 2021 at 3:53 AM Strahil Nikolov <hunter86_bg@xxxxxxxxx> wrote:
Sadly,

I'm out of ideas. It makes sense... if the changelog was changed - then it won't work after the upgrade.

I guess that once you delete the session, you can remove the extended attribute for the time.

Best Regards,
Strahil Nikolov

On Wed, Mar 17, 2021 at 23:11, Matthew Benstead wrote:
Yes, I've run through everything short of regenerating the keys and creating the session again with no errors. Everything looks ok.

But I did notice that the changelog format had changed: instead of being dumped into one directory, the changelogs now seem to be separated into year/month/day directories...

It looks like this changed in 8.0: https://github.com/gluster/glusterfs/issues/154

[root@storage01 changelogs]# ls -lh | head
total 16G
drw-------. 3 root root   24 Mar  9 11:34 2021
-rw-r--r--. 1 root root   51 Mar 17 13:19 CHANGELOG
-rw-r--r--. 1 root root  13K Aug 13  2020 CHANGELOG.1597343197
-rw-r--r--. 1 root root  51K Aug 13  2020 CHANGELOG.1597343212
-rw-r--r--. 1 root root  86K Aug 13  2020 CHANGELOG.1597343227
-rw-r--r--. 1 root root  99K Aug 13  2020 CHANGELOG.1597343242
-rw-r--r--. 1 root root  69K Aug 13  2020 CHANGELOG.1597343257
-rw-r--r--. 1 root root  69K Aug 13  2020 CHANGELOG.1597343272
-rw-r--r--. 1 root root  72K Aug 13  2020 CHANGELOG.1597343287
[root@storage01 changelogs]# ls -lh | tail
-rw-r--r--. 1 root root   92 Mar  1 21:33 CHANGELOG.1614663193
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663731
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663760
-rw-r--r--. 1 root root  511 Mar  1 21:47 CHANGELOG.1614664043
-rw-r--r--. 1 root root  536 Mar  1 21:48 CHANGELOG.1614664101
-rw-r--r--. 1 root root 2.8K Mar  1 21:48 CHANGELOG.1614664116
-rw-r--r--. 1 root root   92 Mar  1 22:20 CHANGELOG.1614666061
-rw-r--r--. 1 root root   92 Mar  1 22:29 CHANGELOG.1614666554
drw-------. 2 root root   10 May  7  2020 csnap
drw-------. 2 root root   38 Aug 13  2020 htime


[root@storage01 changelogs]# ls -lh 2021/03/09/
total 8.0K
-rw-r--r--. 1 root root 51 Mar  9 11:26 CHANGELOG.1615318474
-rw-r--r--. 1 root root 51 Mar  9 12:19 CHANGELOG.1615321197
[root@storage01 changelogs]# ls -lh 2021/03/15/
total 4.0K
-rw-r--r--. 1 root root 51 Mar 15 13:38 CHANGELOG.1615842490
[root@storage01 changelogs]# ls -lh 2021/03/16
total 4.0K
-rw-r--r--. 1 root root 331 Mar 16 12:04 CHANGELOG.1615921482


But it looks like the htime file still records them...

[root@storage01 changelogs]# ls -lh htime
total 84M
-rw-r--r--. 1 root root 84M Mar 17 13:31 HTIME.1597342860

[root@storage01 changelogs]# head -c 256 htime/HTIME.1597342860
/data/storage_a/storage/.glusterfs/changelogs/changelog.1597342875/data/storage_a/storage/.glusterfs/changelogs/changelog.1597342890/data/storage_a/storage/.glusterfs/changelogs/changelog.1597342904/data/storage_a/storage/.glusterfs/changelogs/changelog[root@storage01 changelogs]#

[root@storage01 changelogs]# tail -c 256 htime/HTIME.1597342860
/changelog.1616013484/data/storage_a/storage/.glusterfs/changelogs/2021/03/17/changelog.1616013499/data/storage_a/storage/.glusterfs/changelogs/2021/03/17/changelog.1616013514/data/storage_a/storage/.glusterfs/changelogs/2021/03/17/changelog.1616013529[root@storage01 changelogs]#
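
The head and tail output above shows flat paths at the start of the HTIME file and dated paths at the end, run together with no visible separator. A rough way to see how many entries of each kind a file holds is sketched below. It assumes the entries are NUL-terminated paths (an assumption based on the missing separators), and the function name count_htime_layouts is made up for illustration:

```python
import re

# A new-layout entry has a YYYY/MM/DD component right after "changelogs/".
NEW_LAYOUT = re.compile(rb"/changelogs/\d{4}/\d{2}/\d{2}/[^/]+$")


def count_htime_layouts(data: bytes) -> dict:
    """Count flat (old) and dated (new) changelog paths in raw HTIME bytes."""
    entries = [entry for entry in data.split(b"\0") if entry]
    new = sum(1 for entry in entries if NEW_LAYOUT.search(entry))
    return {"old": len(entries) - new, "new": new}


# Two entries in the shape seen in the head/tail output above.
sample = (
    b"/data/storage_a/storage/.glusterfs/changelogs/changelog.1597342875\0"
    b"/data/storage_a/storage/.glusterfs/changelogs/2021/03/17/changelog.1616013499\0"
)
print(count_htime_layouts(sample))
```

On a real brick the bytes would come from something like open("htime/HTIME.1597342860", "rb").read(), which only reads the file and does not modify it.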



And there seems to be an xattr for time in the brick root - presumably for when changelogs were enabled:

[root@storage01 changelogs]# getfattr -d -m. -e hex /data/storage_a/storage 2>&1 | egrep xtime
trusted.glusterfs.cf94a8f2-324b-40b3-bf72-c3766100ea99.xtime=0x60510140000ef317
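
The hex value above can be unpacked to see what time it encodes. This is a sketch that assumes the xattr holds two big-endian unsigned 32-bit integers, seconds followed by microseconds; decode_xtime is a hypothetical helper name:

```python
import struct
from datetime import datetime, timezone


def decode_xtime(hex_value: str) -> tuple:
    """Split an xtime hex string into (seconds, microseconds), big-endian assumed."""
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    return struct.unpack("!II", bytes.fromhex(digits))


sec, usec = decode_xtime("0x60510140000ef317")
print(sec, usec)
print(datetime.fromtimestamp(sec, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S"), "UTC")
```

Under that assumption the value decodes to 1615921472 seconds, i.e. 2021-03-16 19:04:32 UTC, which is a recent time rather than the August 2020 date on which the oldest changelogs were written.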



Reading through the changes, it looks like there is a script to convert from one format to the other... I didn't see anything in the release notes for 8.0 about it... this seems like it could be the fix, and explain why gluster can't get through the changelogs....  Thoughts?

Thanks,
 -Matthew

--
Matthew Benstead
System Administrator
Pacific Climate Impacts Consortium
University of Victoria, UH1
PO Box 1800, STN CSC
Victoria, BC, V8W 2Y2
Phone: +1-250-721-8432
Email: matthewb@xxxxxxx

On 3/16/21 9:36 PM, Strahil Nikolov wrote:
Have you verified all the steps for creating the geo-replication?

If yes, maybe using "reset-sync-time + delete + create" makes sense. Keep in mind that it will take a long time once the geo-rep is established again.


Best Regards,
Strahil Nikolov

On Tue, Mar 16, 2021 at 22:34, Matthew Benstead wrote:
Thanks Strahil,

I wanted to make sure the issue wasn't occurring because there were no new changes to sync from the master volume. So I created some files and restarted the sync, but it had no effect.

[root@storage01 ~]# cd /storage2/home/test/
[root@storage01 test]# for nums in {1,2,3,4,5,6,7,8,9,0}; do touch $nums.txt; done

[root@storage01 test]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup start
Starting geo-replication session between storage & geoaccount@10.0.231.81::pcic-backup has been successful
[root@storage01 test]# gluster volume geo-replication status
 
MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE    STATUS             CRAWL STATUS    LAST_SYNCED         
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
[root@storage01 test]# gluster volume geo-replication status
 
MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED         
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
[root@storage01 test]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup stop
Stopping geo-replication session between storage & geoaccount@10.0.231.81::pcic-backup has been successful

Still getting the same error about the history crawl failing:

[2021-03-16 19:05:05.227677] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666552}, {end=1615921505}]
[2021-03-16 19:05:05.227733] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615921502}, {total_changelogs=1300114}]
[2021-03-16 19:05:05.408567] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for="" {start=1615921502}, {idx=1300113}]


[2021-03-16 19:05:05.228092] I [resource(worker /data/storage_c/storage):1292:service_loop] GLUSTER: Register time [{time=1615921505}]
[2021-03-16 19:05:05.228626] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 124117:140500837320448:1615921505.23 keep_alive(None,) ...
[2021-03-16 19:05:05.230076] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 124117:140500837320448:1615921505.23 keep_alive -> 1
[2021-03-16 19:05:05.230693] D [master(worker /data/storage_c/storage):540:crawlwrap] _GMaster: primary master with volume id cf94a8f2-324b-40b3-bf72-c3766100ea99 ...
[2021-03-16 19:05:05.237607] I [gsyncdstatus(worker /data/storage_c/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}]
[2021-03-16 19:05:05.242046] I [gsyncdstatus(worker /data/storage_c/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}]
[2021-03-16 19:05:05.242450] I [master(worker /data/storage_c/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664108, 0)}, {etime=1615921505}]
[2021-03-16 19:05:05.244151] E [resource(worker /data/storage_c/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}]
[2021-03-16 19:05:05.394129] E [resource(worker /data/storage_a/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}]
[2021-03-16 19:05:05.408759] E [resource(worker /data/storage_b/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}]
[2021-03-16 19:05:06.158694] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_a/storage}]
[2021-03-16 19:05:06.163052] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-16 19:05:06.204464] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_b/storage}]
[2021-03-16 19:05:06.208961] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-16 19:05:06.220495] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_c/storage}]
[2021-03-16 19:05:06.223947] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
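
The epoch values in the gfchangelog lines above are easier to compare once converted to dates. The sketch below pulls the integer {key=value} fields out of the two informational lines and checks that the requested start falls inside the recorded min/max range. The helper names int_fields and utc are hypothetical, and the log text is abbreviated to the message part:

```python
import re
from datetime import datetime, timezone

FIELD = re.compile(r"\{(\w+)=(\d+)\}")


def int_fields(line: str) -> dict:
    """Extract integer {key=value} fields from a structured log message."""
    return {key: int(value) for key, value in FIELD.findall(line)}


def utc(ts: int) -> str:
    """Format an epoch timestamp as a UTC date and time."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


request = int_fields(
    "Requesting historical changelogs [{start=1614666552}, {end=1615921505}]"
)
available = int_fields(
    "changelogs min max [{min=1597342860}, {max=1615921502}, {total_changelogs=1300114}]"
)

for name, ts in [("start", request["start"]), ("end", request["end"]),
                 ("min", available["min"]), ("max", available["max"])]:
    print(f"{name:>5} {ts} {utc(ts)} UTC")

print("start within recorded range:",
      available["min"] <= request["start"] <= available["max"])
```

The requested start (2021-03-02 06:29:12 UTC) sits well inside the recorded range, so the time window by itself does not appear to be what the history crawl is rejecting.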

I confirmed NTP is working:


pcic-backup02 | CHANGED | rc=0 >>
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+s216-232-132-95 68.69.221.61     2 u   29 1024  377   24.141    2.457   1.081
*yyz-1.ip.0xt.ca 206.108.0.131    2 u  257 1024  377   57.119   -0.084   5.625
+ip102.ip-198-27 192.168.10.254   2 u  189 1024  377   64.227   -3.012   8.867

storage03 | CHANGED | rc=0 >>
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*198.161.203.36  128.233.150.93   2 u   36 1024  377   16.055   -0.381   0.318
+s206-75-147-25. 192.168.10.254   2 u  528 1024  377   23.648   -6.196   4.803
+time.cloudflare 10.69.8.80       3 u  121 1024  377    2.408    0.507   0.791

storage02 | CHANGED | rc=0 >>
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*198.161.203.36  128.233.150.93   2 u  918 1024  377   15.952    0.226   0.197
+linuxgeneration 16.164.40.197    2 u   88 1024  377   62.692   -1.160   2.007
+dns3.switch.ca  206.108.0.131    2 u  857 1024  377   27.315    0.778   0.483

storage01 | CHANGED | rc=0 >>
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+198.161.203.36  128.233.150.93   2 u  121 1024  377   16.069    1.016   0.195
+zero.gotroot.ca 30.114.5.31      2 u  543 1024  377    5.106   -2.462   4.923
*ntp3.torix.ca   .PTP0.           1 u  300 1024  377   54.010    2.421  15.182

pcic-backup01 | CHANGED | rc=0 >>
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*dns3.switch.ca  206.108.0.131    2 u  983 1024  377   26.990    0.523   1.389
+dns2.switch.ca  206.108.0.131    2 u  689 1024  377   26.975   -0.257   0.467
+64.ip-54-39-23. 214.176.184.39   2 u  909 1024  377   64.262   -0.604   6.129

And everything is working on the same version of gluster:

pcic-backup02 | CHANGED | rc=0 >>
glusterfs 8.3
pcic-backup01 | CHANGED | rc=0 >>
glusterfs 8.3
storage02 | CHANGED | rc=0 >>
glusterfs 8.3
storage01 | CHANGED | rc=0 >>
glusterfs 8.3
storage03 | CHANGED | rc=0 >>
glusterfs 8.3
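
With more nodes, a per-host report like the one above can be checked mechanically instead of by eye. This is a small sketch that assumes each host header line ends in ">>" and is followed by one version line, as in the output above; versions_by_host is a hypothetical helper:

```python
def versions_by_host(text: str) -> dict:
    """Map each host to the line printed directly under its 'host | ... >>' header."""
    result, host = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.endswith(">>") and "|" in line:
            host = line.split("|", 1)[0].strip()
        elif host and line:
            result[host] = line
            host = None
    return result


report = """\
pcic-backup02 | CHANGED | rc=0 >>
glusterfs 8.3
pcic-backup01 | CHANGED | rc=0 >>
glusterfs 8.3
storage02 | CHANGED | rc=0 >>
glusterfs 8.3
storage01 | CHANGED | rc=0 >>
glusterfs 8.3
storage03 | CHANGED | rc=0 >>
glusterfs 8.3
"""

versions = versions_by_host(report)
# A single distinct value means every node reports the same version.
print(sorted(set(versions.values())))
```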

SSH works, and the backup user/group is configured with mountbroker:

[root@storage01 ~]# ssh -i /root/.ssh/id_rsa geoaccount@10.0.231.81 uname -a
Linux pcic-backup01 3.10.0-1160.15.2.el7.x86_64 #1 SMP Wed Feb 3 15:06:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
[root@storage01 ~]# ssh -i /root/.ssh/id_rsa geoaccount@10.0.231.82 uname -a
Linux pcic-backup02 3.10.0-1160.15.2.el7.x86_64 #1 SMP Wed Feb 3 15:06:38 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux


[root@pcic-backup01 ~]# grep geo /etc/passwd
geoaccount:x:1000:1000::/home/geoaccount:/bin/bash
[root@pcic-backup01 ~]# grep geo /etc/group
geogroup:x:1000:geoaccount
geoaccount:x:1001:geoaccount

[root@pcic-backup01 ~]# gluster-mountbroker status
+-------------+-------------+---------------------------+--------------+--------------------------+
|     NODE    | NODE STATUS |         MOUNT ROOT        |    GROUP     |          USERS           |
+-------------+-------------+---------------------------+--------------+--------------------------+
| 10.0.231.82 |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
|  localhost  |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
+-------------+-------------+---------------------------+--------------+--------------------------+




So, then if I'm going to have to resync, what is the best way to do this?

With delete, or with delete reset-sync-time?   https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/sect-starting_geo-replication#Deleting_a_Geo-replication_Session



Or by erasing the index, so that I don't have to re-transfer the files that are already on the backup?



Is it possible to use the special-sync-mode option described here? https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.5/html/administration_guide/sect-disaster_recovery



Thoughts?

Thanks,
 -Matthew
--

On 3/12/21 3:31 PM, Strahil Nikolov wrote:
Usually, when I'm stuck - I just start over.
For example, check the prerequisites:
- Is ssh available (no firewall blocking)
- Is time sync enabled (ntp/chrony)
- Is DNS ok on all hosts (including PTR records)
- Is the gluster version the same on all nodes (primary & secondary)

Then start over as if the geo-rep had never existed. For example, stop it and start over with the checks on the secondary nodes (mountbroker, user, group).

Most probably something will come up and you will fix it.

In the worst-case scenario, you will need to clean up the geo-rep and start fresh.


Best Regards,
Strahil Nikolov

On Fri, Mar 12, 2021 at 20:01, Matthew Benstead wrote:
Hi Strahil,

Yes, SELinux was put into permissive mode on the secondary nodes as well:

[root@pcic-backup01 ~]# sestatus | egrep -i  "^SELinux status|mode"
SELinux status:                 enabled
Current mode:                   permissive
Mode from config file:          enforcing

[root@pcic-backup02 ~]# sestatus | egrep -i  "^SELinux status|mode"
SELinux status:                 enabled
Current mode:                   permissive
Mode from config file:          enforcing

The secondary server logs didn't show anything interesting:

gsyncd.log:

[2021-03-11 19:15:28.81820] I [resource(slave 10.0.231.92/data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:28.101819] I [resource(slave 10.0.231.91/data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:28.107012] I [resource(slave 10.0.231.93/data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:28.124567] I [resource(slave 10.0.231.93/data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:28.128145] I [resource(slave 10.0.231.93/data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:29.425739] I [resource(slave 10.0.231.93/data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3184}]
[2021-03-11 19:15:29.427448] I [resource(slave 10.0.231.93/data/storage_c/storage):1166:service_loop] GLUSTER: slave listening
[2021-03-11 19:15:29.433340] I [resource(slave 10.0.231.93/data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3083}]
[2021-03-11 19:15:29.434452] I [resource(slave 10.0.231.91/data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3321}]
[2021-03-11 19:15:29.434314] I [resource(slave 10.0.231.93/data/storage_b/storage):1166:service_loop] GLUSTER: slave listening
[2021-03-11 19:15:29.435575] I [resource(slave 10.0.231.91/data/storage_a/storage):1166:service_loop] GLUSTER: slave listening
[2021-03-11 19:15:29.439769] I [resource(slave 10.0.231.92/data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3576}]
[2021-03-11 19:15:29.440998] I [resource(slave 10.0.231.92/data/storage_c/storage):1166:service_loop] GLUSTER: slave listening
[2021-03-11 19:15:29.454745] I [resource(slave 10.0.231.93/data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3262}]
[2021-03-11 19:15:29.456192] I [resource(slave 10.0.231.93/data/storage_a/storage):1166:service_loop] GLUSTER: slave listening
[2021-03-11 19:15:32.594865] I [repce(slave 10.0.231.92/data/storage_c/storage):96:service_loop] RepceServer: terminating on reaching EOF.
[2021-03-11 19:15:32.607815] I [repce(slave 10.0.231.93/data/storage_c/storage):96:service_loop] RepceServer: terminating on reaching EOF.
[2021-03-11 19:15:32.647663] I [repce(slave 10.0.231.93/data/storage_b/storage):96:service_loop] RepceServer: terminating on reaching EOF.
[2021-03-11 19:15:32.656280] I [repce(slave 10.0.231.91/data/storage_a/storage):96:service_loop] RepceServer: terminating on reaching EOF.
[2021-03-11 19:15:32.668299] I [repce(slave 10.0.231.93/data/storage_a/storage):96:service_loop] RepceServer: terminating on reaching EOF.
[2021-03-11 19:15:44.260689] I [resource(slave 10.0.231.92/data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:44.271457] I [resource(slave 10.0.231.93/data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:44.271883] I [resource(slave 10.0.231.93/data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:44.279670] I [resource(slave 10.0.231.91/data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:44.284261] I [resource(slave 10.0.231.93/data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:45.614280] I [resource(slave 10.0.231.93/data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3419}]
[2021-03-11 19:15:45.615622] I [resource(slave 10.0.231.93/data/storage_b/storage):1166:service_loop] GLUSTER: slave listening
[2021-03-11 19:15:45.617986] I [resource(slave 10.0.231.93/data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3461}]
[2021-03-11 19:15:45.618180] I [resource(slave 10.0.231.91/data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3380}]
[2021-03-11 19:15:45.619539] I [resource(slave 10.0.231.91/data/storage_a/storage):1166:service_loop] GLUSTER: slave listening
[2021-03-11 19:15:45.618999] I [resource(slave 10.0.231.93/data/storage_c/storage):1166:service_loop] GLUSTER: slave listening
[2021-03-11 19:15:45.620843] I [resource(slave 10.0.231.93/data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3361}]
[2021-03-11 19:15:45.621347] I [resource(slave 10.0.231.92/data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.3604}]
[2021-03-11 19:15:45.622179] I [resource(slave 10.0.231.93/data/storage_a/storage):1166:service_loop] GLUSTER: slave listening
[2021-03-11 19:15:45.622541] I [resource(slave 10.0.231.92/data/storage_c/storage):1166:service_loop] GLUSTER: slave listening
[2021-03-11 19:15:47.626054] I [repce(slave 10.0.231.91/data/storage_a/storage):96:service_loop] RepceServer: terminating on reaching EOF.
[2021-03-11 19:15:48.778399] I [repce(slave 10.0.231.93/data/storage_c/storage):96:service_loop] RepceServer: terminating on reaching EOF.
[2021-03-11 19:15:48.778491] I [repce(slave 10.0.231.92/data/storage_c/storage):96:service_loop] RepceServer: terminating on reaching EOF.
[2021-03-11 19:15:48.796854] I [repce(slave 10.0.231.93/data/storage_a/storage):96:service_loop] RepceServer: terminating on reaching EOF.
[2021-03-11 19:15:48.800697] I [repce(slave 10.0.231.93/data/storage_b/storage):96:service_loop] RepceServer: terminating on reaching EOF.

The mnt geo-rep files were also uninteresting:
[2021-03-11 19:15:28.250150] I [MSGID: 100030] [glusterfsd.c:2689:main] 0-/usr/sbin/glusterfs: Started running version [{arg=/usr/sbin/glusterfs}, {version=8.3}, {cmdlinestr=/usr/sbin/glusterfs --user-map-root=geoaccount --aux-gfid-mount --acl --log-level=INFO --log-file=/var/log/glusterfs/geo-replication-slaves/storage_10.0.231.81_pcic-backup/mnt-10.0.231.93-data-storage_b-storage.log --volfile-server=localhost --volfile-id=pcic-backup --client-pid=-1 /var/mountbroker-root/user1000/mtpt-geoaccount-GmVoUI}]
[2021-03-11 19:15:28.253485] I [glusterfsd.c:2424:daemonize] 0-glusterfs: Pid of current running process is 157484
[2021-03-11 19:15:28.267911] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}]
[2021-03-11 19:15:28.267984] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}]
[2021-03-11 19:15:28.268371] I [glusterfsd-mgmt.c:2170:mgmt_getspec_cbk] 0-glusterfs: Received list of available volfile servers: 10.0.231.82:24007
[2021-03-11 19:15:28.271729] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}]
[2021-03-11 19:15:28.271762] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}]
[2021-03-11 19:15:28.272223] I [MSGID: 114020] [client.c:2315:notify] 0-pcic-backup-client-0: parent translators are ready, attempting connect on transport []
[2021-03-11 19:15:28.275883] I [MSGID: 114020] [client.c:2315:notify] 0-pcic-backup-client-1: parent translators are ready, attempting connect on transport []
[2021-03-11 19:15:28.276154] I [rpc-clnt.c:1975:rpc_clnt_reconfig] 0-pcic-backup-client-0: changing port to 49153 (from 0)
[2021-03-11 19:15:28.276193] I [socket.c:849:__socket_shutdown] 0-pcic-backup-client-0: intentional socket shutdown(13)
Final graph:
...
+------------------------------------------------------------------------------+
[2021-03-11 19:15:28.282144] I [socket.c:849:__socket_shutdown] 0-pcic-backup-client-1: intentional socket shutdown(15)
[2021-03-11 19:15:28.286536] I [MSGID: 114057] [client-handshake.c:1128:select_server_supported_programs] 0-pcic-backup-client-0: Using Program [{Program-name=GlusterFS 4.x v1}, {Num=1298437}, {Version=400}]
[2021-03-11 19:15:28.287208] I [MSGID: 114046] [client-handshake.c:857:client_setvolume_cbk] 0-pcic-backup-client-0: Connected, attached to remote volume [{conn-name=pcic-backup-client-0}, {remote_subvol=/data/brick}]
[2021-03-11 19:15:28.290162] I [MSGID: 114057] [client-handshake.c:1128:select_server_supported_programs] 0-pcic-backup-client-1: Using Program [{Program-name=GlusterFS 4.x v1}, {Num=1298437}, {Version=400}]
[2021-03-11 19:15:28.291122] I [MSGID: 114046] [client-handshake.c:857:client_setvolume_cbk] 0-pcic-backup-client-1: Connected, attached to remote volume [{conn-name=pcic-backup-client-1}, {remote_subvol=/data/brick}]
[2021-03-11 19:15:28.292703] I [fuse-bridge.c:5300:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.23
[2021-03-11 19:15:28.292730] I [fuse-bridge.c:5926:fuse_graph_sync] 0-fuse: switched to graph 0
[2021-03-11 19:15:32.809518] I [fuse-bridge.c:6242:fuse_thread_proc] 0-fuse: initiating unmount of /var/mountbroker-root/user1000/mtpt-geoaccount-GmVoUI
[2021-03-11 19:15:32.810216] W [glusterfsd.c:1439:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7ea5) [0x7ff56b175ea5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x55664e67db45] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x55664e67d9ab] ) 0-: received signum (15), shutting down
[2021-03-11 19:15:32.810253] I [fuse-bridge.c:7074:fini] 0-fuse: Unmounting '/var/mountbroker-root/user1000/mtpt-geoaccount-GmVoUI'.
[2021-03-11 19:15:32.810268] I [fuse-bridge.c:7079:fini] 0-fuse: Closing fuse connection to '/var/mountbroker-root/user1000/mtpt-geoaccount-GmVoUI'.


I'm really at a loss for where to go from here, it seems like everything is set up correctly, and it has been working well through the 7.x minor versions, but the jump to 8 has broken something...

There definitely are lots of changelogs on the servers that fit into the timeframe..... I haven't made any writes to the source volume.... do you think that's the problem? That it needs some new changelog info to sync?
I had been holding off on making any writes in case I needed to go back to Gluster 7.9 - not sure if that's really a good option or not.

[root@storage01 changelogs]# for dirs in {a,b,c}; do echo "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh /data/storage_$dirs/storage/.glusterfs/changelogs | head; echo ""; done
/data/storage_a/storage/.glusterfs/changelogs
total 16G
drw-------. 3 root root   24 Mar  9 11:34 2021
-rw-r--r--. 1 root root   51 Mar 12 09:50 CHANGELOG
-rw-r--r--. 1 root root  13K Aug 13  2020 CHANGELOG.1597343197
-rw-r--r--. 1 root root  51K Aug 13  2020 CHANGELOG.1597343212
-rw-r--r--. 1 root root  86K Aug 13  2020 CHANGELOG.1597343227
-rw-r--r--. 1 root root  99K Aug 13  2020 CHANGELOG.1597343242
-rw-r--r--. 1 root root  69K Aug 13  2020 CHANGELOG.1597343257
-rw-r--r--. 1 root root  69K Aug 13  2020 CHANGELOG.1597343272
-rw-r--r--. 1 root root  72K Aug 13  2020 CHANGELOG.1597343287

/data/storage_b/storage/.glusterfs/changelogs
total 3.3G
drw-------. 3 root root   24 Mar  9 11:34 2021
-rw-r--r--. 1 root root   51 Mar 12 09:50 CHANGELOG
-rw-r--r--. 1 root root  13K Aug 13  2020 CHANGELOG.1597343197
-rw-r--r--. 1 root root  53K Aug 13  2020 CHANGELOG.1597343212
-rw-r--r--. 1 root root  89K Aug 13  2020 CHANGELOG.1597343227
-rw-r--r--. 1 root root  89K Aug 13  2020 CHANGELOG.1597343242
-rw-r--r--. 1 root root  69K Aug 13  2020 CHANGELOG.1597343257
-rw-r--r--. 1 root root  71K Aug 13  2020 CHANGELOG.1597343272
-rw-r--r--. 1 root root  86K Aug 13  2020 CHANGELOG.1597343287

/data/storage_c/storage/.glusterfs/changelogs
total 9.6G
drw-------. 3 root root   16 Mar  9 11:34 2021
-rw-r--r--. 1 root root   51 Mar 12 09:50 CHANGELOG
-rw-r--r--. 1 root root  16K Aug 13  2020 CHANGELOG.1597343199
-rw-r--r--. 1 root root  71K Aug 13  2020 CHANGELOG.1597343214
-rw-r--r--. 1 root root 122K Aug 13  2020 CHANGELOG.1597343229
-rw-r--r--. 1 root root  73K Aug 13  2020 CHANGELOG.1597343244
-rw-r--r--. 1 root root 100K Aug 13  2020 CHANGELOG.1597343259
-rw-r--r--. 1 root root  95K Aug 13  2020 CHANGELOG.1597343274
-rw-r--r--. 1 root root  92K Aug 13  2020 CHANGELOG.1597343289

[root@storage01 changelogs]# for dirs in {a,b,c}; do echo "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh /data/storage_$dirs/storage/.glusterfs/changelogs | tail; echo ""; done
/data/storage_a/storage/.glusterfs/changelogs
-rw-r--r--. 1 root root   92 Mar  1 21:33 CHANGELOG.1614663193
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663731
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663760
-rw-r--r--. 1 root root  511 Mar  1 21:47 CHANGELOG.1614664043
-rw-r--r--. 1 root root  536 Mar  1 21:48 CHANGELOG.1614664101
-rw-r--r--. 1 root root 2.8K Mar  1 21:48 CHANGELOG.1614664116
-rw-r--r--. 1 root root   92 Mar  1 22:20 CHANGELOG.1614666061
-rw-r--r--. 1 root root   92 Mar  1 22:29 CHANGELOG.1614666554
drw-------. 2 root root   10 May  7  2020 csnap
drw-------. 2 root root   38 Aug 13  2020 htime

/data/storage_b/storage/.glusterfs/changelogs
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663731
-rw-r--r--. 1 root root  480 Mar  1 21:42 CHANGELOG.1614663745
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663760
-rw-r--r--. 1 root root  524 Mar  1 21:47 CHANGELOG.1614664043
-rw-r--r--. 1 root root  495 Mar  1 21:48 CHANGELOG.1614664100
-rw-r--r--. 1 root root 1.6K Mar  1 21:48 CHANGELOG.1614664114
-rw-r--r--. 1 root root   92 Mar  1 22:20 CHANGELOG.1614666060
-rw-r--r--. 1 root root   92 Mar  1 22:29 CHANGELOG.1614666553
drw-------. 2 root root   10 May  7  2020 csnap
drw-------. 2 root root   38 Aug 13  2020 htime

/data/storage_c/storage/.glusterfs/changelogs
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663738
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663753
-rw-r--r--. 1 root root  395 Mar  1 21:47 CHANGELOG.1614664051
-rw-r--r--. 1 root root  316 Mar  1 21:48 CHANGELOG.1614664094
-rw-r--r--. 1 root root 1.2K Mar  1 21:48 CHANGELOG.1614664109
-rw-r--r--. 1 root root  174 Mar  1 21:48 CHANGELOG.1614664123
-rw-r--r--. 1 root root   92 Mar  1 22:20 CHANGELOG.1614666061
-rw-r--r--. 1 root root   92 Mar  1 22:29 CHANGELOG.1614666553
drw-------. 2 root root    6 May  7  2020 csnap
drw-------. 2 root root   30 Aug 13  2020 htime

[root@storage02 ~]# for dirs in {a,b,c}; do echo "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh /data/storage_$dirs/storage/.glusterfs/changelogs | head; echo ""; done
/data/storage_a/storage/.glusterfs/changelogs
total 9.6G
drw-------. 3 root root   24 Mar  9 11:34 2021
-rw-r--r--. 1 root root   51 Mar 12 09:50 CHANGELOG
-rw-r--r--. 1 root root 4.2K Aug 13  2020 CHANGELOG.1597343193
-rw-r--r--. 1 root root  32K Aug 13  2020 CHANGELOG.1597343208
-rw-r--r--. 1 root root 107K Aug 13  2020 CHANGELOG.1597343223
-rw-r--r--. 1 root root 120K Aug 13  2020 CHANGELOG.1597343238
-rw-r--r--. 1 root root  72K Aug 13  2020 CHANGELOG.1597343253
-rw-r--r--. 1 root root 111K Aug 13  2020 CHANGELOG.1597343268
-rw-r--r--. 1 root root  91K Aug 13  2020 CHANGELOG.1597343283

/data/storage_b/storage/.glusterfs/changelogs
total 16G
drw-------. 3 root root   24 Mar  9 11:34 2021
-rw-r--r--. 1 root root   51 Mar 12 09:50 CHANGELOG
-rw-r--r--. 1 root root 3.9K Aug 13  2020 CHANGELOG.1597343193
-rw-r--r--. 1 root root  35K Aug 13  2020 CHANGELOG.1597343208
-rw-r--r--. 1 root root  85K Aug 13  2020 CHANGELOG.1597343223
-rw-r--r--. 1 root root 103K Aug 13  2020 CHANGELOG.1597343238
-rw-r--r--. 1 root root  70K Aug 13  2020 CHANGELOG.1597343253
-rw-r--r--. 1 root root  72K Aug 13  2020 CHANGELOG.1597343268
-rw-r--r--. 1 root root  73K Aug 13  2020 CHANGELOG.1597343283

/data/storage_c/storage/.glusterfs/changelogs
total 3.3G
drw-------. 3 root root   16 Mar  9 11:34 2021
-rw-r--r--. 1 root root   51 Mar 12 09:51 CHANGELOG
-rw-r--r--. 1 root root  21K Aug 13  2020 CHANGELOG.1597343202
-rw-r--r--. 1 root root  75K Aug 13  2020 CHANGELOG.1597343217
-rw-r--r--. 1 root root  92K Aug 13  2020 CHANGELOG.1597343232
-rw-r--r--. 1 root root  77K Aug 13  2020 CHANGELOG.1597343247
-rw-r--r--. 1 root root  66K Aug 13  2020 CHANGELOG.1597343262
-rw-r--r--. 1 root root  84K Aug 13  2020 CHANGELOG.1597343277
-rw-r--r--. 1 root root  81K Aug 13  2020 CHANGELOG.1597343292

[root@storage02 ~]# for dirs in {a,b,c}; do echo "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh /data/storage_$dirs/storage/.glusterfs/changelogs | tail; echo ""; done
/data/storage_a/storage/.glusterfs/changelogs
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663734
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663749
-rw-r--r--. 1 root root  395 Mar  1 21:47 CHANGELOG.1614664052
-rw-r--r--. 1 root root  316 Mar  1 21:48 CHANGELOG.1614664096
-rw-r--r--. 1 root root 1.2K Mar  1 21:48 CHANGELOG.1614664111
-rw-r--r--. 1 root root  174 Mar  1 21:48 CHANGELOG.1614664126
-rw-r--r--. 1 root root   92 Mar  1 22:20 CHANGELOG.1614666056
-rw-r--r--. 1 root root   92 Mar  1 22:29 CHANGELOG.1614666560
drw-------. 2 root root   10 May  7  2020 csnap
drw-------. 2 root root   38 Aug 13  2020 htime

/data/storage_b/storage/.glusterfs/changelogs
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663735
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663749
-rw-r--r--. 1 root root  511 Mar  1 21:47 CHANGELOG.1614664052
-rw-r--r--. 1 root root  316 Mar  1 21:48 CHANGELOG.1614664096
-rw-r--r--. 1 root root 1.8K Mar  1 21:48 CHANGELOG.1614664111
-rw-r--r--. 1 root root 1.4K Mar  1 21:48 CHANGELOG.1614664126
-rw-r--r--. 1 root root   92 Mar  1 22:20 CHANGELOG.1614666060
-rw-r--r--. 1 root root   92 Mar  1 22:29 CHANGELOG.1614666556
drw-------. 2 root root   10 May  7  2020 csnap
drw-------. 2 root root   38 Aug 13  2020 htime

/data/storage_c/storage/.glusterfs/changelogs
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663738
-rw-r--r--. 1 root root  521 Mar  1 21:42 CHANGELOG.1614663752
-rw-r--r--. 1 root root  524 Mar  1 21:47 CHANGELOG.1614664042
-rw-r--r--. 1 root root   92 Mar  1 21:47 CHANGELOG.1614664057
-rw-r--r--. 1 root root  536 Mar  1 21:48 CHANGELOG.1614664102
-rw-r--r--. 1 root root 1.6K Mar  1 21:48 CHANGELOG.1614664117
-rw-r--r--. 1 root root   92 Mar  1 22:20 CHANGELOG.1614666057
-rw-r--r--. 1 root root   92 Mar  1 22:29 CHANGELOG.1614666550
drw-------. 2 root root    6 May  7  2020 csnap
drw-------. 2 root root   30 Aug 13  2020 htime


[root@storage03 ~]# for dirs in {a,b,c}; do echo "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh /data/storage_$dirs/storage/.glusterfs/changelogs | head; echo ""; done
/data/storage_a/storage/.glusterfs/changelogs
total 3.4G
drw-------. 3 root root   24 Mar  9 11:34 2021
-rw-r--r--. 1 root root   51 Mar 12 09:50 CHANGELOG
-rw-r--r--. 1 root root  19K Aug 13  2020 CHANGELOG.1597343201
-rw-r--r--. 1 root root  66K Aug 13  2020 CHANGELOG.1597343215
-rw-r--r--. 1 root root  91K Aug 13  2020 CHANGELOG.1597343230
-rw-r--r--. 1 root root  82K Aug 13  2020 CHANGELOG.1597343245
-rw-r--r--. 1 root root  64K Aug 13  2020 CHANGELOG.1597343259
-rw-r--r--. 1 root root  75K Aug 13  2020 CHANGELOG.1597343274
-rw-r--r--. 1 root root  81K Aug 13  2020 CHANGELOG.1597343289

/data/storage_b/storage/.glusterfs/changelogs
total 9.6G
drw-------. 3 root root   24 Mar  9 11:34 2021
-rw-r--r--. 1 root root   51 Mar 12 09:51 CHANGELOG
-rw-r--r--. 1 root root  19K Aug 13  2020 CHANGELOG.1597343201
-rw-r--r--. 1 root root  80K Aug 13  2020 CHANGELOG.1597343215
-rw-r--r--. 1 root root 119K Aug 13  2020 CHANGELOG.1597343230
-rw-r--r--. 1 root root  65K Aug 13  2020 CHANGELOG.1597343244
-rw-r--r--. 1 root root 100K Aug 13  2020 CHANGELOG.1597343259
-rw-r--r--. 1 root root  95K Aug 13  2020 CHANGELOG.1597343274
-rw-r--r--. 1 root root  92K Aug 13  2020 CHANGELOG.1597343289

/data/storage_c/storage/.glusterfs/changelogs
total 16G
drw-------. 3 root root   16 Mar  9 11:34 2021
-rw-r--r--. 1 root root   51 Mar 12 09:51 CHANGELOG
-rw-r--r--. 1 root root 3.9K Aug 13  2020 CHANGELOG.1597343193
-rw-r--r--. 1 root root  35K Aug 13  2020 CHANGELOG.1597343208
-rw-r--r--. 1 root root  85K Aug 13  2020 CHANGELOG.1597343223
-rw-r--r--. 1 root root 103K Aug 13  2020 CHANGELOG.1597343238
-rw-r--r--. 1 root root  70K Aug 13  2020 CHANGELOG.1597343253
-rw-r--r--. 1 root root  71K Aug 13  2020 CHANGELOG.1597343268
-rw-r--r--. 1 root root  73K Aug 13  2020 CHANGELOG.1597343283

[root@storage03 ~]# for dirs in {a,b,c}; do echo "/data/storage_$dirs/storage/.glusterfs/changelogs"; ls -lh /data/storage_$dirs/storage/.glusterfs/changelogs | tail; echo ""; done
/data/storage_a/storage/.glusterfs/changelogs
-rw-r--r--. 1 root root   92 Mar  1 21:33 CHANGELOG.1614663183
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663740
-rw-r--r--. 1 root root  521 Mar  1 21:42 CHANGELOG.1614663755
-rw-r--r--. 1 root root  524 Mar  1 21:47 CHANGELOG.1614664049
-rw-r--r--. 1 root root 1.9K Mar  1 21:48 CHANGELOG.1614664106
-rw-r--r--. 1 root root  174 Mar  1 21:48 CHANGELOG.1614664121
-rw-r--r--. 1 root root   92 Mar  1 22:20 CHANGELOG.1614666051
-rw-r--r--. 1 root root   92 Mar  1 22:29 CHANGELOG.1614666559
drw-------. 2 root root   10 May  7  2020 csnap
drw-------. 2 root root   38 Aug 13  2020 htime

/data/storage_b/storage/.glusterfs/changelogs
-rw-r--r--. 1 root root  474 Mar  1 21:33 CHANGELOG.1614663182
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663739
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663753
-rw-r--r--. 1 root root  395 Mar  1 21:47 CHANGELOG.1614664049
-rw-r--r--. 1 root root 1.4K Mar  1 21:48 CHANGELOG.1614664106
-rw-r--r--. 1 root root  174 Mar  1 21:48 CHANGELOG.1614664120
-rw-r--r--. 1 root root   92 Mar  1 22:20 CHANGELOG.1614666063
-rw-r--r--. 1 root root   92 Mar  1 22:29 CHANGELOG.1614666557
drw-------. 2 root root   10 May  7  2020 csnap
drw-------. 2 root root   38 Aug 13  2020 htime

/data/storage_c/storage/.glusterfs/changelogs
-rw-r--r--. 1 root root  468 Mar  1 21:33 CHANGELOG.1614663183
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663740
-rw-r--r--. 1 root root   92 Mar  1 21:42 CHANGELOG.1614663754
-rw-r--r--. 1 root root  511 Mar  1 21:47 CHANGELOG.1614664048
-rw-r--r--. 1 root root 2.0K Mar  1 21:48 CHANGELOG.1614664105
-rw-r--r--. 1 root root 1.4K Mar  1 21:48 CHANGELOG.1614664120
-rw-r--r--. 1 root root   92 Mar  1 22:20 CHANGELOG.1614666063
-rw-r--r--. 1 root root   92 Mar  1 22:29 CHANGELOG.1614666556
drw-------. 2 root root    6 May  7  2020 csnap
drw-------. 2 root root   30 Aug 13  2020 htime
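
One way to quantify the mixed layout in these listings is to count the legacy flat files against whatever already sits under the dated tree. This is only a sketch: count_changelogs is a made-up helper name, not a Gluster tool, and it assumes the new layout keeps CHANGELOG.<timestamp> names below the year directory (the listings above show only the top-level 2021 entry, not its contents).

```shell
# Hypothetical helper, not part of Gluster.
# Prints how many CHANGELOG.<ts> files sit directly in the changelogs
# directory (old flat layout) and how many sit in any subdirectory.
# Note: matching files under csnap or htime, if any exist, would also be
# counted as nested.
count_changelogs() {
    dir="$1"
    flat=$(find "$dir" -maxdepth 1 -type f -name 'CHANGELOG.*' | wc -l | tr -d ' ')
    nested=$(find "$dir" -mindepth 2 -type f -name 'CHANGELOG.*' | wc -l | tr -d ' ')
    echo "$dir flat=$flat nested=$nested"
}
```

It could be called once per brick on each node, for example as count_changelogs /data/storage_a/storage/.glusterfs/changelogs. The function only reads the directory tree.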

Thanks,
 -Matthew

--
Matthew Benstead
System Administrator
Pacific Climate Impacts Consortium
University of Victoria, UH1
PO Box 1800, STN CSC
Victoria, BC, V8W 2Y2
Phone: +1-250-721-8432
Email: matthewb@xxxxxxx

On 3/11/21 11:37 PM, Strahil Nikolov wrote:

Have you checked the logs on the secondary volume's nodes, and the SELinux status?

Best Regards,
Strahil Nikolov

On Thu, Mar 11, 2021 at 21:36, Matthew Benstead
Hi Strahil,

It looks like the relevant options are changelog_log_level and log_level. Both were at INFO, so I set them to DEBUG:

[root@storage01 ~]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup config | egrep -i "log_level"
changelog_log_level:INFO
cli_log_level:INFO
gluster_log_level:INFO
log_level:INFO
slave_gluster_log_level:INFO
slave_log_level:INFO

[root@storage01 ~]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup config changelog_log_level DEBUG
geo-replication config updated successfully

[root@storage01 ~]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup config log_level DEBUG
geo-replication config updated successfully


Then I restarted geo-replication:

[root@storage01 ~]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup start
Starting geo-replication session between storage & geoaccount@10.0.231.81::pcic-backup has been successful
[root@storage01 ~]# gluster volume geo-replication status
 
MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE    STATUS             CRAWL STATUS    LAST_SYNCED         
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Initializing...    N/A             N/A                 
[root@storage01 ~]# gluster volume geo-replication status
 
MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED         
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.91    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.91    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.92    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.92    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.92    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.93    storage       /data/storage_c/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.93    storage       /data/storage_b/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A                 
10.0.231.93    storage       /data/storage_a/storage    geoaccount    ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A 

[root@storage01 ~]# gluster volume geo-replication storage geoaccount@10.0.231.81::pcic-backup stop
Stopping geo-replication session between storage & geoaccount@10.0.231.81::pcic-backup has been successful
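
With nine workers it can help to reduce the status table to a single number while retrying. The following is a minimal sketch; count_faulty is a hypothetical helper that simply counts rows containing the word Faulty in whatever is piped into it.

```shell
# Hypothetical helper, not part of Gluster.
# Reads `gluster volume geo-replication status` output on stdin and
# prints the number of lines that contain the word Faulty.
count_faulty() {
    awk '/Faulty/ { n++ } END { print n + 0 }'
}
```

Used as gluster volume geo-replication status | count_faulty, the second table above would give 9, one per brick.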


The changes-*.log files didn't really show anything new about changelog selection:

[root@storage01 storage_10.0.231.81_pcic-backup]# cat changes-data-storage_a-storage.log | egrep "2021-03-11"
[2021-03-11 19:15:30.552889] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_a/storage}, {notify_filter=1}]
[2021-03-11 19:15:30.552893] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}]
[2021-03-11 19:15:30.552894] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}]
[2021-03-11 19:15:30.553633] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}]
[2021-03-11 19:15:30.553634] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}]
[2021-03-11 19:15:30.554236] D [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service inited.
[2021-03-11 19:15:30.554403] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0
[2021-03-11 19:15:30.554420] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:30.554933] D [socket.c:4485:socket_init] 0-socket.gfchangelog: disabling nodelay
[2021-03-11 19:15:30.554944] D [socket.c:4523:socket_init] 0-socket.gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:30.554949] D [socket.c:4543:socket_init] 0-socket.gfchangelog: Reconfigured transport.keepalivecnt=9
[2021-03-11 19:15:30.555002] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 23
[2021-03-11 19:15:30.555324] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: LIBGFCHANGELOG REBORP, Num: 1886350951, Ver: 1, Port: 0
[2021-03-11 19:15:30.555345] D [rpc-clnt.c:1020:rpc_clnt_connection_init] 0-gfchangelog: defaulting frame-timeout to 30mins
[2021-03-11 19:15:30.555351] D [rpc-clnt.c:1032:rpc_clnt_connection_init] 0-gfchangelog: disable ping-timeout
[2021-03-11 19:15:30.555358] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:30.555399] D [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay
[2021-03-11 19:15:30.555406] D [socket.c:4523:socket_init] 0-gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:32.555711] D [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: ping timeout is 0, returning
[2021-03-11 19:15:32.572157] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666553}, {end=1615490132}]
[2021-03-11 19:15:32.572436] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615490121}, {total_changelogs=1256897}]
[2021-03-11 19:15:32.621244] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for="" {start=1615490121}, {idx=1256896}]
[2021-03-11 19:15:46.733182] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_a/storage}, {notify_filter=1}]
[2021-03-11 19:15:46.733316] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}]
[2021-03-11 19:15:46.733348] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}]
[2021-03-11 19:15:46.734031] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}]
[2021-03-11 19:15:46.734085] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}]
[2021-03-11 19:15:46.734591] D [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service inited.
[2021-03-11 19:15:46.734755] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0
[2021-03-11 19:15:46.734772] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:46.735256] D [socket.c:4485:socket_init] 0-socket.gfchangelog: disabling nodelay
[2021-03-11 19:15:46.735266] D [socket.c:4523:socket_init] 0-socket.gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:46.735271] D [socket.c:4543:socket_init] 0-socket.gfchangelog: Reconfigured transport.keepalivecnt=9
[2021-03-11 19:15:46.735325] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 21
[2021-03-11 19:15:46.735704] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: LIBGFCHANGELOG REBORP, Num: 1886350951, Ver: 1, Port: 0
[2021-03-11 19:15:46.735721] D [rpc-clnt.c:1020:rpc_clnt_connection_init] 0-gfchangelog: defaulting frame-timeout to 30mins
[2021-03-11 19:15:46.735726] D [rpc-clnt.c:1032:rpc_clnt_connection_init] 0-gfchangelog: disable ping-timeout
[2021-03-11 19:15:46.735733] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:46.735771] D [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay
[2021-03-11 19:15:46.735778] D [socket.c:4523:socket_init] 0-gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:47.618464] D [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: ping timeout is 0, returning


[root@storage01 storage_10.0.231.81_pcic-backup]# cat changes-data-storage_b-storage.log | egrep "2021-03-11"
[2021-03-11 19:15:30.611457] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_b/storage}, {notify_filter=1}]
[2021-03-11 19:15:30.611574] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}]
[2021-03-11 19:15:30.611641] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}]
[2021-03-11 19:15:30.611645] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}]
[2021-03-11 19:15:30.612325] D [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service inited.
[2021-03-11 19:15:30.612488] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0
[2021-03-11 19:15:30.612507] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:30.613005] D [socket.c:4485:socket_init] 0-socket.gfchangelog: disabling nodelay
[2021-03-11 19:15:30.613130] D [socket.c:4523:socket_init] 0-socket.gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:30.613142] D [socket.c:4543:socket_init] 0-socket.gfchangelog: Reconfigured transport.keepalivecnt=9
[2021-03-11 19:15:30.613208] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 22
[2021-03-11 19:15:30.613545] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: LIBGFCHANGELOG REBORP, Num: 1886350951, Ver: 1, Port: 0
[2021-03-11 19:15:30.613567] D [rpc-clnt.c:1020:rpc_clnt_connection_init] 0-gfchangelog: defaulting frame-timeout to 30mins
[2021-03-11 19:15:30.613574] D [rpc-clnt.c:1032:rpc_clnt_connection_init] 0-gfchangelog: disable ping-timeout
[2021-03-11 19:15:30.613582] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:30.613637] D [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay
[2021-03-11 19:15:30.613654] D [socket.c:4523:socket_init] 0-gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:32.614273] D [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: ping timeout is 0, returning
[2021-03-11 19:15:32.643628] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666552}, {end=1615490132}]
[2021-03-11 19:15:32.643716] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615490123}, {total_changelogs=1264296}]
[2021-03-11 19:15:32.700397] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for="" {start=1615490123}, {idx=1264295}]
[2021-03-11 19:15:46.832322] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_b/storage}, {notify_filter=1}]
[2021-03-11 19:15:46.832394] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=0}]
[2021-03-11 19:15:46.832465] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=1}]
[2021-03-11 19:15:46.832531] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}]
[2021-03-11 19:15:46.833086] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}]
[2021-03-11 19:15:46.833648] D [rpcsvc.c:2831:rpcsvc_init] 0-rpc-service: RPC service inited.
[2021-03-11 19:15:46.833817] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: GF-DUMP, Num: 123451501, Ver: 1, Port: 0
[2021-03-11 19:15:46.833835] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:46.834368] D [socket.c:4485:socket_init] 0-socket.gfchangelog: disabling nodelay
[2021-03-11 19:15:46.834380] D [socket.c:4523:socket_init] 0-socket.gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:46.834386] D [socket.c:4543:socket_init] 0-socket.gfchangelog: Reconfigured transport.keepalivecnt=9
[2021-03-11 19:15:46.834441] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 23
[2021-03-11 19:15:46.834768] D [rpcsvc.c:2342:rpcsvc_program_register] 0-rpc-service: New program registered: LIBGFCHANGELOG REBORP, Num: 1886350951, Ver: 1, Port: 0
[2021-03-11 19:15:46.834789] D [rpc-clnt.c:1020:rpc_clnt_connection_init] 0-gfchangelog: defaulting frame-timeout to 30mins
[2021-03-11 19:15:46.834795] D [rpc-clnt.c:1032:rpc_clnt_connection_init] 0-gfchangelog: disable ping-timeout
[2021-03-11 19:15:46.834802] D [rpc-transport.c:278:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/8.3/rpc-transport/socket.so
[2021-03-11 19:15:46.834845] D [socket.c:4485:socket_init] 0-gfchangelog: disabling nodelay
[2021-03-11 19:15:46.834853] D [socket.c:4523:socket_init] 0-gfchangelog: Configured transport.tcp-user-timeout=42
[2021-03-11 19:15:47.618476] D [rpc-clnt-ping.c:298:rpc_clnt_start_ping] 0-gfchangelog: ping timeout is 0, returning
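
The three lines that matter in each of these logs are the history request, the min/max summary, and the "wrong result" error. A small sketch for pulling them out and making the epoch values readable follows; both function names are made up for illustration, and the MSGID values are simply the ones visible in the logs above.

```shell
# Hypothetical helpers for reading the history-crawl lines in changes-*.log.

# Print only the history request, min/max, and error lines
# (MSGIDs 132035, 132019 and 132009 as they appear in the logs above).
history_lines() {
    grep -E 'MSGID: (132035|132019|132009)' "$1"
}

# Convert an epoch timestamp to UTC. Tries GNU date first, then BSD date.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S' 2>/dev/null ||
        date -u -r "$1" '+%Y-%m-%d %H:%M:%S'
}
```

For storage_a, epoch_to_utc 1614666553 gives 2021-03-02 06:29:13 and epoch_to_utc 1615490132 gives 2021-03-11 19:15:32. The start value lines up with the last flat CHANGELOG.16146665xx files in the directory listings, and the end value is the moment the worker was started. In both logs the requested start lies between the reported min and max, so the request does not look out of range at the start side.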


gsyncd logged a lot but I'm not sure if it's helpful:

[2021-03-11 19:15:00.41898] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:21.551302] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:21.631470] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:21.718386] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:21.804991] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:26.203999] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:26.284775] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:26.573355] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:26.653752] D [gsyncd(monitor):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:26.756994] D [monitor(monitor):304:distribute] <top>: master bricks: [{'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_a/storage'}, {'host': '10.0.231.92', 'uuid': 'ebbd7b74-3cf8-4752-a71c-b0f0ca86c97d', 'dir': '/data/storage_b/storage'}, {'host': '10.0.231.93', 'uuid': '8b28b331-3780-46bc-9da3-fb27de4ab57b', 'dir': '/data/storage_c/storage'}, {'host': '10.0.231.92', 'uuid': 'ebbd7b74-3cf8-4752-a71c-b0f0ca86c97d', 'dir': '/data/storage_a/storage'}, {'host': '10.0.231.93', 'uuid': '8b28b331-3780-46bc-9da3-fb27de4ab57b', 'dir': '/data/storage_b/storage'}, {'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_c/storage'}, {'host': '10.0.231.93', 'uuid': '8b28b331-3780-46bc-9da3-fb27de4ab57b', 'dir': '/data/storage_a/storage'}, {'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_b/storage'}, {'host': '10.0.231.92', 'uuid': 'ebbd7b74-3cf8-4752-a71c-b0f0ca86c97d', 'dir': '/data/storage_c/storage'}]
[2021-03-11 19:15:26.757252] D [monitor(monitor):314:distribute] <top>: slave SSH gateway: geoaccount@10.0.231.81
[2021-03-11 19:15:27.416235] D [monitor(monitor):334:distribute] <top>: slave bricks: [{'host': '10.0.231.81', 'uuid': 'b88dea4f-31ec-416a-9110-3ccdc3910acd', 'dir': '/data/brick'}, {'host': '10.0.231.82', 'uuid': 'be50a8de-3934-4fee-a80d-8e2e99017902', 'dir': '/data/brick'}]
[2021-03-11 19:15:27.416825] D [syncdutils(monitor):932:is_hot] Volinfo: brickpath: '10.0.231.91:/data/storage_a/storage'
[2021-03-11 19:15:27.417273] D [syncdutils(monitor):932:is_hot] Volinfo: brickpath: '10.0.231.91:/data/storage_c/storage'
[2021-03-11 19:15:27.417515] D [syncdutils(monitor):932:is_hot] Volinfo: brickpath: '10.0.231.91:/data/storage_b/storage'
[2021-03-11 19:15:27.417763] D [monitor(monitor):348:distribute] <top>: worker specs: [({'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_a/storage'}, ('geoaccount@10.0.231.81', 'b88dea4f-31ec-416a-9110-3ccdc3910acd'), '1', False), ({'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_c/storage'}, ('geoaccount@10.0.231.82', 'be50a8de-3934-4fee-a80d-8e2e99017902'), '2', False), ({'host': '10.0.231.91', 'uuid': 'afc24654-2887-41f6-a9c2-8e835de243b6', 'dir': '/data/storage_b/storage'}, ('geoaccount@10.0.231.82', 'be50a8de-3934-4fee-a80d-8e2e99017902'), '3', False)]
[2021-03-11 19:15:27.425009] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}]
[2021-03-11 19:15:27.426764] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}]
[2021-03-11 19:15:27.429208] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}]
[2021-03-11 19:15:27.432280] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately
[2021-03-11 19:15:27.434195] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately
[2021-03-11 19:15:27.436584] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately
[2021-03-11 19:15:27.478806] D [gsyncd(worker /data/storage_c/storage):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:27.478852] D [gsyncd(worker /data/storage_b/storage):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:27.480104] D [gsyncd(worker /data/storage_a/storage):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:27.500456] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
[2021-03-11 19:15:27.501375] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
[2021-03-11 19:15:27.502003] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
[2021-03-11 19:15:27.525511] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192117:140572692309824:1615490127.53 __repce_version__() ...
[2021-03-11 19:15:27.525582] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192115:139891296405312:1615490127.53 __repce_version__() ...
[2021-03-11 19:15:27.526089] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192114:140388828780352:1615490127.53 __repce_version__() ...
[2021-03-11 19:15:29.435985] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192117:140572692309824:1615490127.53 __repce_version__ -> 1.0
[2021-03-11 19:15:29.436213] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192117:140572692309824:1615490129.44 version() ...
[2021-03-11 19:15:29.437136] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192117:140572692309824:1615490129.44 version -> 1.0
[2021-03-11 19:15:29.437268] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192117:140572692309824:1615490129.44 pid() ...
[2021-03-11 19:15:29.437915] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192117:140572692309824:1615490129.44 pid -> 157321
[2021-03-11 19:15:29.438004] I [resource(worker /data/storage_a/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9359}]
[2021-03-11 19:15:29.438072] I [resource(worker /data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:29.494538] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192115:139891296405312:1615490127.53 __repce_version__ -> 1.0
[2021-03-11 19:15:29.494748] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192115:139891296405312:1615490129.49 version() ...
[2021-03-11 19:15:29.495290] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192115:139891296405312:1615490129.49 version -> 1.0
[2021-03-11 19:15:29.495400] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192115:139891296405312:1615490129.5 pid() ...
[2021-03-11 19:15:29.495872] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192115:139891296405312:1615490129.5 pid -> 88110
[2021-03-11 19:15:29.495960] I [resource(worker /data/storage_b/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9944}]
[2021-03-11 19:15:29.496028] I [resource(worker /data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:29.501255] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192114:140388828780352:1615490127.53 __repce_version__ -> 1.0
[2021-03-11 19:15:29.501454] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192114:140388828780352:1615490129.5 version() ...
[2021-03-11 19:15:29.502258] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192114:140388828780352:1615490129.5 version -> 1.0
[2021-03-11 19:15:29.502444] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192114:140388828780352:1615490129.5 pid() ...
[2021-03-11 19:15:29.503140] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192114:140388828780352:1615490129.5 pid -> 88111
[2021-03-11 19:15:29.503232] I [resource(worker /data/storage_c/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=2.0026}]
[2021-03-11 19:15:29.503302] I [resource(worker /data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:29.533899] D [resource(worker /data/storage_a/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place
[2021-03-11 19:15:29.595736] D [resource(worker /data/storage_b/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place
[2021-03-11 19:15:29.601110] D [resource(worker /data/storage_c/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place
[2021-03-11 19:15:30.541542] D [resource(worker /data/storage_a/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared
[2021-03-11 19:15:30.541816] I [resource(worker /data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1037}]
[2021-03-11 19:15:30.541887] I [subcmds(worker /data/storage_a/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2021-03-11 19:15:30.542042] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}]
[2021-03-11 19:15:30.542125] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_a/storage) connected
[2021-03-11 19:15:30.543323] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}]
[2021-03-11 19:15:30.544460] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}]
[2021-03-11 19:15:30.552103] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage
[2021-03-11 19:15:30.602937] D [resource(worker /data/storage_b/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared
[2021-03-11 19:15:30.603117] I [resource(worker /data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1070}]
[2021-03-11 19:15:30.603197] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2021-03-11 19:15:30.603353] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}]
[2021-03-11 19:15:30.603338] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_b/storage) connected
[2021-03-11 19:15:30.604620] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}]
[2021-03-11 19:15:30.605600] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}]
[2021-03-11 19:15:30.608365] D [resource(worker /data/storage_c/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared
[2021-03-11 19:15:30.608534] I [resource(worker /data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1052}]
[2021-03-11 19:15:30.608612] I [subcmds(worker /data/storage_c/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2021-03-11 19:15:30.608762] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}]
[2021-03-11 19:15:30.608779] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_c/storage) connected
[2021-03-11 19:15:30.610033] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}]
[2021-03-11 19:15:30.610637] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage
[2021-03-11 19:15:30.610970] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}]
[2021-03-11 19:15:30.616197] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage
[2021-03-11 19:15:31.371265] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:31.451000] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:31.537257] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:31.623800] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:32.555840] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage
[2021-03-11 19:15:32.556051] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage
[2021-03-11 19:15:32.556122] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage
[2021-03-11 19:15:32.556179] I [master(worker /data/storage_a/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage}]
[2021-03-11 19:15:32.556359] I [resource(worker /data/storage_a/storage):1292:service_loop] GLUSTER: Register time [{time=1615490132}]
[2021-03-11 19:15:32.556823] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192117:140570487928576:1615490132.56 keep_alive(None,) ...
[2021-03-11 19:15:32.558429] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192117:140570487928576:1615490132.56 keep_alive -> 1
[2021-03-11 19:15:32.558974] D [master(worker /data/storage_a/storage):540:crawlwrap] _GMaster: primary master with volume id cf94a8f2-324b-40b3-bf72-c3766100ea99 ...
[2021-03-11 19:15:32.567478] I [gsyncdstatus(worker /data/storage_a/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}]
[2021-03-11 19:15:32.571824] I [gsyncdstatus(worker /data/storage_a/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}]
[2021-03-11 19:15:32.572052] I [master(worker /data/storage_a/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666553, 0)}, {entry_stime=(1614664115, 0)}, {etime=1615490132}]
[2021-03-11 19:15:32.614506] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage
[2021-03-11 19:15:32.614701] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage
[2021-03-11 19:15:32.614788] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage
[2021-03-11 19:15:32.614845] I [master(worker /data/storage_b/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage}]
[2021-03-11 19:15:32.615000] I [resource(worker /data/storage_b/storage):1292:service_loop] GLUSTER: Register time [{time=1615490132}]
[2021-03-11 19:15:32.615586] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192115:139889215526656:1615490132.62 keep_alive(None,) ...
[2021-03-11 19:15:32.617373] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192115:139889215526656:1615490132.62 keep_alive -> 1
[2021-03-11 19:15:32.618144] D [master(worker /data/storage_b/storage):540:crawlwrap] _GMaster: primary master with volume id cf94a8f2-324b-40b3-bf72-c3766100ea99 ...
[2021-03-11 19:15:32.619323] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage
[2021-03-11 19:15:32.619491] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage
[2021-03-11 19:15:32.619739] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage
[2021-03-11 19:15:32.619863] I [master(worker /data/storage_c/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage}]
[2021-03-11 19:15:32.620040] I [resource(worker /data/storage_c/storage):1292:service_loop] GLUSTER: Register time [{time=1615490132}]
[2021-03-11 19:15:32.620599] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192114:140386886469376:1615490132.62 keep_alive(None,) ...
[2021-03-11 19:15:32.621397] E [resource(worker /data/storage_a/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}]
[2021-03-11 19:15:32.622035] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192114:140386886469376:1615490132.62 keep_alive -> 1
[2021-03-11 19:15:32.622701] D [master(worker /data/storage_c/storage):540:crawlwrap] _GMaster: primary master with volume id cf94a8f2-324b-40b3-bf72-c3766100ea99 ...
[2021-03-11 19:15:32.627031] I [gsyncdstatus(worker /data/storage_b/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}]
[2021-03-11 19:15:32.643184] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}]
[2021-03-11 19:15:32.643528] I [master(worker /data/storage_b/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664113, 0)}, {etime=1615490132}]
[2021-03-11 19:15:32.645148] I [gsyncdstatus(worker /data/storage_c/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}]
[2021-03-11 19:15:32.649631] I [gsyncdstatus(worker /data/storage_c/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}]
[2021-03-11 19:15:32.649882] I [master(worker /data/storage_c/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664108, 0)}, {etime=1615490132}]
[2021-03-11 19:15:32.650907] E [resource(worker /data/storage_c/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}]
[2021-03-11 19:15:32.700489] E [resource(worker /data/storage_b/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}]
[2021-03-11 19:15:33.545886] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_a/storage}]
[2021-03-11 19:15:33.550487] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-11 19:15:33.606991] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_b/storage}]
[2021-03-11 19:15:33.611573] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-11 19:15:33.612337] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_c/storage}]
[2021-03-11 19:15:33.615777] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-11 19:15:34.684247] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:34.764971] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:34.851174] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:34.937166] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:36.994502] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:37.73805] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:37.159288] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:37.244153] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:38.916510] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:38.997649] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:39.84816] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:39.172045] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:40.896359] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:40.976135] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:41.62052] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:41.147902] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:42.791997] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:42.871239] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:42.956609] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:43.42473] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:43.566190] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
[2021-03-11 19:15:43.566400] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}]
[2021-03-11 19:15:43.572240] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately
[2021-03-11 19:15:43.612744] D [gsyncd(worker /data/storage_a/storage):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:43.625689] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
[2021-03-11 19:15:43.626060] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}]
[2021-03-11 19:15:43.632287] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
[2021-03-11 19:15:43.632137] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately
[2021-03-11 19:15:43.632508] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}]
[2021-03-11 19:15:43.635565] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
[2021-03-11 19:15:43.637835] D [monitor(monitor):195:monitor] Monitor: Worker would mount volume privately
[2021-03-11 19:15:43.661304] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192535:140367272073024:1615490143.66 __repce_version__() ...
[2021-03-11 19:15:43.674499] D [gsyncd(worker /data/storage_b/storage):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:43.680706] D [gsyncd(worker /data/storage_c/storage):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:43.693773] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
[2021-03-11 19:15:43.700957] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
[2021-03-11 19:15:43.717686] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192539:139907321804608:1615490143.72 __repce_version__() ...
[2021-03-11 19:15:43.725369] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192541:140653101852480:1615490143.73 __repce_version__() ...
[2021-03-11 19:15:44.289117] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:44.375693] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:44.472251] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:44.558429] D [gsyncd(status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:45.619694] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192535:140367272073024:1615490143.66 __repce_version__ -> 1.0
[2021-03-11 19:15:45.619930] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192535:140367272073024:1615490145.62 version() ...
[2021-03-11 19:15:45.621191] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192535:140367272073024:1615490145.62 version -> 1.0
[2021-03-11 19:15:45.621332] D [repce(worker /data/storage_a/storage):195:push] RepceClient: call 192535:140367272073024:1615490145.62 pid() ...
[2021-03-11 19:15:45.621859] D [repce(worker /data/storage_a/storage):215:__call__] RepceClient: call 192535:140367272073024:1615490145.62 pid -> 158229
[2021-03-11 19:15:45.621939] I [resource(worker /data/storage_a/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9862}]
[2021-03-11 19:15:45.622000] I [resource(worker /data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:45.714468] D [resource(worker /data/storage_a/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place
[2021-03-11 19:15:45.718441] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192541:140653101852480:1615490143.73 __repce_version__ -> 1.0
[2021-03-11 19:15:45.718643] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192541:140653101852480:1615490145.72 version() ...
[2021-03-11 19:15:45.719492] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192541:140653101852480:1615490145.72 version -> 1.0
[2021-03-11 19:15:45.719772] D [repce(worker /data/storage_c/storage):195:push] RepceClient: call 192541:140653101852480:1615490145.72 pid() ...
[2021-03-11 19:15:45.720202] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192539:139907321804608:1615490143.72 __repce_version__ -> 1.0
[2021-03-11 19:15:45.720381] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192539:139907321804608:1615490145.72 version() ...
[2021-03-11 19:15:45.720463] D [repce(worker /data/storage_c/storage):215:__call__] RepceClient: call 192541:140653101852480:1615490145.72 pid -> 88921
[2021-03-11 19:15:45.720694] I [resource(worker /data/storage_c/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=2.0196}]
[2021-03-11 19:15:45.720882] I [resource(worker /data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:45.721146] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192539:139907321804608:1615490145.72 version -> 1.0
[2021-03-11 19:15:45.721271] D [repce(worker /data/storage_b/storage):195:push] RepceClient: call 192539:139907321804608:1615490145.72 pid() ...
[2021-03-11 19:15:45.721795] D [repce(worker /data/storage_b/storage):215:__call__] RepceClient: call 192539:139907321804608:1615490145.72 pid -> 88924
[2021-03-11 19:15:45.721911] I [resource(worker /data/storage_b/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=2.0280}]
[2021-03-11 19:15:45.721993] I [resource(worker /data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-11 19:15:45.816891] D [resource(worker /data/storage_b/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place
[2021-03-11 19:15:45.816960] D [resource(worker /data/storage_c/storage):880:inhibit] DirectMounter: auxiliary glusterfs mount in place
[2021-03-11 19:15:46.721534] D [resource(worker /data/storage_a/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared
[2021-03-11 19:15:46.721726] I [resource(worker /data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.0997}]
[2021-03-11 19:15:46.721796] I [subcmds(worker /data/storage_a/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2021-03-11 19:15:46.721971] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}]
[2021-03-11 19:15:46.722122] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_a/storage) connected
[2021-03-11 19:15:46.723871] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}]
[2021-03-11 19:15:46.725100] D [master(worker /data/storage_a/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}]
[2021-03-11 19:15:46.732400] D [master(worker /data/storage_a/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage
[2021-03-11 19:15:46.823477] D [resource(worker /data/storage_c/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared
[2021-03-11 19:15:46.823645] I [resource(worker /data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1027}]
[2021-03-11 19:15:46.823754] I [subcmds(worker /data/storage_c/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2021-03-11 19:15:46.823932] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}]
[2021-03-11 19:15:46.823904] D [resource(worker /data/storage_b/storage):964:inhibit] DirectMounter: auxiliary glusterfs mount prepared
[2021-03-11 19:15:46.823930] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_c/storage) connected
[2021-03-11 19:15:46.824103] I [resource(worker /data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1020}]
[2021-03-11 19:15:46.824184] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2021-03-11 19:15:46.824340] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=xsync}]
[2021-03-11 19:15:46.824321] D [monitor(monitor):222:monitor] Monitor: worker(/data/storage_b/storage) connected
[2021-03-11 19:15:46.825100] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}]
[2021-03-11 19:15:46.825414] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changelog}]
[2021-03-11 19:15:46.826375] D [master(worker /data/storage_b/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}]
[2021-03-11 19:15:46.826574] D [master(worker /data/storage_c/storage):105:gmaster_builder] <top>: setting up change detection mode [{mode=changeloghistory}]
[2021-03-11 19:15:46.831506] D [master(worker /data/storage_b/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage
[2021-03-11 19:15:46.833168] D [master(worker /data/storage_c/storage):778:setup_working_dir] _GMaster: changelog working dir /var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage
[2021-03-11 19:15:47.275141] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:47.320247] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:47.570877] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:47.615571] D [gsyncd(config-get):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:47.620893] E [syncdutils(worker /data/storage_a/storage):325:log_raise_exception] <top>: connection to peer is broken
[2021-03-11 19:15:47.620939] E [syncdutils(worker /data/storage_c/storage):325:log_raise_exception] <top>: connection to peer is broken
[2021-03-11 19:15:47.621668] E [syncdutils(worker /data/storage_a/storage):847:errlog] Popen: command returned error [{cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-_AyCOc/79fa3dc75e30f532b4a40bc08c2b10a1.sock geoaccount@10.0.231.81 /nonexistent/gsyncd slave storage geoaccount@10.0.231.81::pcic-backup --master-node 10.0.231.91 --master-node-id afc24654-2887-41f6-a9c2-8e835de243b6 --master-brick /data/storage_a/storage --local-node 10.0.231.81 --local-node-id b88dea4f-31ec-416a-9110-3ccdc3910acd --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/sbin --master-dist-count 3}, {error=255}]
[2021-03-11 19:15:47.621685] E [syncdutils(worker /data/storage_c/storage):847:errlog] Popen: command returned error [{cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-WOgOEu/e15fc58bb13552de0710eaf018209548.sock geoaccount@10.0.231.82 /nonexistent/gsyncd slave storage geoaccount@10.0.231.81::pcic-backup --master-node 10.0.231.91 --master-node-id afc24654-2887-41f6-a9c2-8e835de243b6 --master-brick /data/storage_c/storage --local-node 10.0.231.82 --local-node-id be50a8de-3934-4fee-a80d-8e2e99017902 --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/sbin --master-dist-count 3}, {error=255}]
[2021-03-11 19:15:47.621776] E [syncdutils(worker /data/storage_a/storage):851:logerr] Popen: ssh> Killed by signal 15.
[2021-03-11 19:15:47.621819] E [syncdutils(worker /data/storage_c/storage):851:logerr] Popen: ssh> Killed by signal 15.
[2021-03-11 19:15:47.621850] E [syncdutils(worker /data/storage_b/storage):325:log_raise_exception] <top>: connection to peer is broken
[2021-03-11 19:15:47.622437] E [syncdutils(worker /data/storage_b/storage):847:errlog] Popen: command returned error [{cmd=ssh -oPasswordAuthentication=no -oStrictHostKeyChecking=no -i /var/lib/glusterd/geo-replication/secret.pem -p 22 -oControlMaster=auto -S /tmp/gsyncd-aux-ssh-Vy935W/e15fc58bb13552de0710eaf018209548.sock geoaccount@10.0.231.82 /nonexistent/gsyncd slave storage geoaccount@10.0.231.81::pcic-backup --master-node 10.0.231.91 --master-node-id afc24654-2887-41f6-a9c2-8e835de243b6 --master-brick /data/storage_b/storage --local-node 10.0.231.82 --local-node-id be50a8de-3934-4fee-a80d-8e2e99017902 --slave-timeout 120 --slave-log-level INFO --slave-gluster-log-level INFO --slave-gluster-command-dir /usr/sbin --master-dist-count 3}, {error=255}]
[2021-03-11 19:15:47.622556] E [syncdutils(worker /data/storage_b/storage):851:logerr] Popen: ssh> Killed by signal 15.
[2021-03-11 19:15:47.723756] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_a/storage}]
[2021-03-11 19:15:47.731405] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-11 19:15:47.825223] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_c/storage}]
[2021-03-11 19:15:47.825685] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_b/storage}]
[2021-03-11 19:15:47.829011] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-11 19:15:47.830965] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-11 19:15:48.669634] D [gsyncd(monitor-status):303:main] <top>: Using session config file [{path=/var/lib/glusterd/geo-replication/storage_10.0.231.81_pcic-backup/gsyncd.conf}]
[2021-03-11 19:15:48.683784] I [subcmds(monitor-status):29:subcmd_monitor_status] <top>: Monitor Status Change [{status=Stopped}]


Thanks,
 -Matthew


On 3/11/21 9:37 AM, Strahil Nikolov wrote:
Notice: This message was sent from outside the University of Victoria email system. Please be cautious with links and sensitive information.


I think you have to raise the log level of the geo-rep session to debug.
I will try to find the command needed to do that.
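
For reference, a minimal sketch of what raising the log level could look like. It assumes the `config log-level` option of `gluster volume geo-replication` (please check it against the admin guide for your version); the volume and slave names are the ones from this thread, and the helper name `georep_log_level_cmd` is made up for illustration. The sketch only builds and prints the command, it does not run it.

```python
def georep_log_level_cmd(master_vol, slave, level="DEBUG"):
    """Build the argument list for setting the geo-rep session log level.

    Assumption: the session option is called "log-level"; verify this
    against the documentation of the installed Gluster version.
    """
    return ["gluster", "volume", "geo-replication", master_vol, slave,
            "config", "log-level", level]


cmd = georep_log_level_cmd("storage", "geoaccount@10.0.231.81::pcic-backup")
# Print the command so it can be reviewed before running it by hand
# (or passed to subprocess.run(cmd, check=True) on a master node).
print(" ".join(cmd))
```

Setting the level back to INFO afterwards keeps the gsyncd logs from growing quickly.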


Best Regards,
Strahil Nikolov






On Thursday, 11 March 2021, 00:38:41 GMT+2, Matthew Benstead <matthewb@xxxxxxx> wrote:





Thanks Strahil,

Right - I had come across your message from early January saying that v8 from the CentOS SIG was missing the SELinux rules, and I put SELinux into permissive mode after the upgrade when I saw denied messages in the audit logs.

[root@storage01 ~]# sestatus | egrep "^SELinux status|[mM]ode"
SELinux status:                 enabled
Current mode:                   permissive
Mode from config file:          enforcing

Yes - I am using an unprivileged user for georep:

[root@pcic-backup01 ~]# gluster-mountbroker status
+-------------+-------------+---------------------------+--------------+--------------------------+
|     NODE    | NODE STATUS |         MOUNT ROOT        |    GROUP     |          USERS           |
+-------------+-------------+---------------------------+--------------+--------------------------+
| 10.0.231.82 |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
|  localhost  |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
+-------------+-------------+---------------------------+--------------+--------------------------+

[root@pcic-backup02 ~]# gluster-mountbroker status
+-------------+-------------+---------------------------+--------------+--------------------------+
|     NODE    | NODE STATUS |         MOUNT ROOT        |    GROUP     |          USERS           |
+-------------+-------------+---------------------------+--------------+--------------------------+
| 10.0.231.81 |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
|  localhost  |          UP | /var/mountbroker-root(OK) | geogroup(OK) | geoaccount(pcic-backup)  |
+-------------+-------------+---------------------------+--------------+--------------------------+

Thanks,
 -Matthew


--
Matthew Benstead
System Administrator
Pacific Climate Impacts Consortium
University of Victoria, UH1
PO Box 1800, STN CSC
Victoria, BC, V8W 2Y2
Phone: +1-250-721-8432
Email: matthewb@xxxxxxx


On 3/10/21 2:11 PM, Strahil Nikolov wrote:


  


I have tested georep on v8.3 and it was running quite well until SELinux got involved.

Are you using SELinux?

Are you using an unprivileged user for the georep?




Also, you can check https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.4/html/administration_guide/sect-troubleshooting_geo-replication




Best Regards,

Strahil Nikolov


  
  
On Thu, Mar 11, 2021 at 0:03, Matthew Benstead <matthewb@xxxxxxx> wrote:


  
  
Hello,

I recently upgraded my Distributed-Replicate cluster from Gluster 7.9 to 8.3 on CentOS7 using the CentOS Storage SIG packages. I had geo-replication syncing properly before the upgrade, but now it is not working.

After I had upgraded both master and slave clusters I attempted to start geo-replication again, but it goes to faulty quickly:

[root@storage01 ~]# gluster volume geo-replication storage  geoaccount@10.0.231.81::pcic-backup start
Starting geo-replication session between storage &  geoaccount@10.0.231.81::pcic-backup has been successful

[root@storage01 ~]# gluster volume geo-replication status

MASTER NODE    MASTER VOL    MASTER BRICK               SLAVE USER    SLAVE                                        SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
---------------------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.91    storage       /data/storage_a/storage    geoaccount     ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.91    storage       /data/storage_c/storage    geoaccount     ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.91    storage       /data/storage_b/storage    geoaccount     ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.92    storage       /data/storage_b/storage    geoaccount     ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.92    storage       /data/storage_a/storage    geoaccount     ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.92    storage       /data/storage_c/storage    geoaccount     ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.93    storage       /data/storage_c/storage    geoaccount     ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.93    storage       /data/storage_b/storage    geoaccount     ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.93    storage       /data/storage_a/storage    geoaccount     ssh://geoaccount@10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A

[root@storage01 ~]# gluster volume geo-replication storage  geoaccount@10.0.231.81::pcic-backup stop
Stopping geo-replication session between storage &  geoaccount@10.0.231.81::pcic-backup has been successful


I went through the gsyncd logs and see it attempts to go back through the changelogs - which would make sense - but fails:

[2021-03-10 19:18:42.165807] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
[2021-03-10 19:18:42.166136] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}]
[2021-03-10 19:18:42.167829] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}]
[2021-03-10 19:18:42.172343] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
[2021-03-10 19:18:42.172580] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}]
[2021-03-10 19:18:42.235574] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
[2021-03-10 19:18:42.236613] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
[2021-03-10 19:18:42.238614] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
[2021-03-10 19:18:44.144856] I [resource(worker /data/storage_b/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9059}]
[2021-03-10 19:18:44.145065] I [resource(worker /data/storage_b/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-10 19:18:44.162873] I [resource(worker /data/storage_a/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9259}]
[2021-03-10 19:18:44.163412] I [resource(worker /data/storage_a/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-10 19:18:44.167506] I [resource(worker /data/storage_c/storage):1436:connect_remote] SSH: SSH connection between master and slave established. [{duration=1.9316}]
[2021-03-10 19:18:44.167746] I [resource(worker /data/storage_c/storage):1116:connect] GLUSTER: Mounting gluster volume locally...
[2021-03-10 19:18:45.251372] I [resource(worker /data/storage_b/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1062}]
[2021-03-10 19:18:45.251583] I [subcmds(worker /data/storage_b/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2021-03-10 19:18:45.271950] I [resource(worker /data/storage_c/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1041}]
[2021-03-10 19:18:45.272118] I [subcmds(worker /data/storage_c/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2021-03-10 19:18:45.275180] I [resource(worker /data/storage_a/storage):1139:connect] GLUSTER: Mounted gluster volume [{duration=1.1116}]
[2021-03-10 19:18:45.275361] I [subcmds(worker /data/storage_a/storage):84:subcmd_worker] <top>: Worker spawn successful. Acknowledging back to monitor
[2021-03-10 19:18:47.265618] I [master(worker /data/storage_b/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_b-storage}]
[2021-03-10 19:18:47.265954] I [resource(worker /data/storage_b/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}]
[2021-03-10 19:18:47.276746] I [gsyncdstatus(worker /data/storage_b/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}]
[2021-03-10 19:18:47.281194] I [gsyncdstatus(worker /data/storage_b/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}]
[2021-03-10 19:18:47.281404] I [master(worker /data/storage_b/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664113, 0)}, {etime=1615403927}]
[2021-03-10 19:18:47.285340] I [master(worker /data/storage_c/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_c-storage}]
[2021-03-10 19:18:47.285579] I [resource(worker /data/storage_c/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}]
[2021-03-10 19:18:47.287383] I [master(worker /data/storage_a/storage):1645:register] _GMaster: Working dir [{path=/var/lib/misc/gluster/gsyncd/storage_10.0.231.81_pcic-backup/data-storage_a-storage}]
[2021-03-10 19:18:47.287697] I [resource(worker /data/storage_a/storage):1292:service_loop] GLUSTER: Register time [{time=1615403927}]
[2021-03-10 19:18:47.298415] I [gsyncdstatus(worker /data/storage_c/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}]
[2021-03-10 19:18:47.301342] I [gsyncdstatus(worker /data/storage_a/storage):281:set_active] GeorepStatus: Worker Status Change [{status=Active}]
[2021-03-10 19:18:47.304183] I [gsyncdstatus(worker /data/storage_c/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}]
[2021-03-10 19:18:47.304418] I [master(worker /data/storage_c/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, {entry_stime=(1614664108, 0)}, {etime=1615403927}]
[2021-03-10 19:18:47.305294] E [resource(worker /data/storage_c/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}]
[2021-03-10 19:18:47.308124] I [gsyncdstatus(worker /data/storage_a/storage):253:set_worker_crawl_status] GeorepStatus: Crawl Status Change [{status=History Crawl}]
[2021-03-10 19:18:47.308509] I [master(worker /data/storage_a/storage):1559:crawl] _GMaster: starting history crawl [{turns=1}, {stime=(1614666553, 0)}, {entry_stime=(1614664115, 0)}, {etime=1615403927}]
[2021-03-10 19:18:47.357470] E [resource(worker /data/storage_b/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}]
[2021-03-10 19:18:47.383949] E [resource(worker /data/storage_a/storage):1312:service_loop] GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}]
[2021-03-10 19:18:48.255340] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_b/storage}]
[2021-03-10 19:18:48.260052] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-10 19:18:48.275651] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_c/storage}]
[2021-03-10 19:18:48.278064] I [monitor(monitor):228:monitor] Monitor: worker died in startup phase [{brick=/data/storage_a/storage}]
[2021-03-10 19:18:48.280453] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-10 19:18:48.282274] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Faulty}]
[2021-03-10 19:18:58.275702] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
[2021-03-10 19:18:58.276041] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_b/storage}, {slave_node=10.0.231.82}]
[2021-03-10 19:18:58.296252] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
[2021-03-10 19:18:58.296506] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_c/storage}, {slave_node=10.0.231.82}]
[2021-03-10 19:18:58.301290] I [gsyncdstatus(monitor):248:set_worker_status] GeorepStatus: Worker Status Change [{status=Initializing...}]
[2021-03-10 19:18:58.301521] I [monitor(monitor):160:monitor] Monitor: starting gsyncd worker [{brick=/data/storage_a/storage}, {slave_node=10.0.231.81}]
[2021-03-10 19:18:58.345817] I [resource(worker /data/storage_b/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
[2021-03-10 19:18:58.361268] I [resource(worker /data/storage_c/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
[2021-03-10 19:18:58.367985] I [resource(worker /data/storage_a/storage):1387:connect_remote] SSH: Initializing SSH connection between master and slave...
[2021-03-10 19:18:59.115143] I [subcmds(monitor-status):29:subcmd_monitor_status] <top>: Monitor Status Change [{status=Stopped}]
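
With three workers interleaved in one file, it can help to pull out only the error lines per worker. The following is a rough sketch, not part of gsyncd: the regular expression is inferred from the log lines above, and the function name `errors_by_worker` is hypothetical.

```python
import re

# Line layout inferred from the gsyncd log above:
# [timestamp] LEVEL [module(context):lineno:function] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<module>\w+)\((?P<context>[^)]*)\):(?P<lineno>\d+):(?P<func>\w+)\] "
    r"(?P<message>.*)$"
)


def errors_by_worker(lines):
    """Group the messages of error-level ("E") lines by worker context."""
    grouped = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "E":
            grouped.setdefault(match.group("context"), []).append(
                match.group("message"))
    return grouped


# Two lines copied from the log above.
sample = [
    "[2021-03-10 19:18:47.304418] I [master(worker /data/storage_c/storage):1559:crawl] "
    "_GMaster: starting history crawl [{turns=1}, {stime=(1614666552, 0)}, "
    "{entry_stime=(1614664108, 0)}, {etime=1615403927}]",
    "[2021-03-10 19:18:47.305294] E [resource(worker /data/storage_c/storage):1312:service_loop] "
    "GLUSTER: Changelog History Crawl failed [{error=[Errno 0] Success}]",
]

for worker, messages in errors_by_worker(sample).items():
    print(worker, "->", messages)
```

To run it against a real log, pass an open file object instead of `sample`.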

It seems like there is an issue selecting the changelogs - perhaps similar to this issue?  https://github.com/gluster/glusterfs/issues/1766

[root@storage01 storage_10.0.231.81_pcic-backup]# cat changes-data-storage_a-storage.log
[2021-03-10 19:18:45.284764] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_a/storage}, {notify_filter=1}]
[2021-03-10 19:18:45.285275] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}]
[2021-03-10 19:18:45.285269] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}]
[2021-03-10 19:18:45.286615] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 21
[2021-03-10 19:18:47.308607] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666553}, {end=1615403927}]
[2021-03-10 19:18:47.308659] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615403927}, {total_changelogs=1250878}]
[2021-03-10 19:18:47.383774] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for="" {start=1615403927}, {idx=1250877}]

[root@storage01 storage_10.0.231.81_pcic-backup]# tail -7 changes-data-storage_b-storage.log
[2021-03-10 19:18:45.263211] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=3}]
[2021-03-10 19:18:45.263151] I [MSGID: 132028] [gf-changelog.c:577:gf_changelog_register_generic] 0-gfchangelog: Registering brick [{brick=/data/storage_b/storage}, {notify_filter=1}]
[2021-03-10 19:18:45.263294] I [MSGID: 101190] [event-epoll.c:670:event_dispatch_epoll_worker] 0-epoll: Started thread with index [{index=2}]
[2021-03-10 19:18:45.264598] I [socket.c:929:__socket_server_bind] 0-socket.gfchangelog: closing (AF_UNIX) reuse check socket 23
[2021-03-10 19:18:47.281499] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] 0-gfchangelog: Requesting historical changelogs [{start=1614666552}, {end=1615403927}]
[2021-03-10 19:18:47.281551] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] 0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615403927}, {total_changelogs=1258258}]
[2021-03-10 19:18:47.357244] E [MSGID: 132009] [gf-history-changelog.c:941:gf_history_changelog] 0-gfchangelog: wrong result [{for="" {start=1615403927}, {idx=1258257}]
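
As a quick sanity check on the numbers in these logs, the requested start/end can be compared with the reported min/max. This is only a sketch that parses the two informational gfchangelog lines shown above; the patterns are inferred from those lines, and the helper names (`summarize`, `request_within_range`, `as_utc`) are made up for illustration.

```python
import re
from datetime import datetime, timezone

# Patterns inferred from the gfchangelog lines above.
REQUEST_RE = re.compile(r"\{start=(\d+)\}, \{end=(\d+)\}")
MINMAX_RE = re.compile(
    r"\{min=(\d+)\}, \{max=(\d+)\}, \{total_changelogs=(\d+)\}")


def summarize(lines):
    """Collect the requested range and the available changelog range."""
    info = {}
    for line in lines:
        if "Requesting historical changelogs" in line:
            match = REQUEST_RE.search(line)
            if match:
                info["start"], info["end"] = (int(v) for v in match.groups())
        elif "changelogs min max" in line:
            match = MINMAX_RE.search(line)
            if match:
                info["min"], info["max"], info["total"] = (
                    int(v) for v in match.groups())
    return info


def request_within_range(info):
    """True if the requested window lies inside the available min/max."""
    return info["min"] <= info["start"] <= info["end"] <= info["max"]


def as_utc(epoch):
    """Render an epoch value from the log as a readable UTC timestamp."""
    return datetime.fromtimestamp(epoch, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M:%S UTC")


# Two lines copied from changes-data-storage_a-storage.log above.
sample = [
    "[2021-03-10 19:18:47.308607] I [MSGID: 132035] [gf-history-changelog.c:837:gf_history_changelog] "
    "0-gfchangelog: Requesting historical changelogs [{start=1614666553}, {end=1615403927}]",
    "[2021-03-10 19:18:47.308659] I [MSGID: 132019] [gf-history-changelog.c:755:gf_changelog_extract_min_max] "
    "0-gfchangelog: changelogs min max [{min=1597342860}, {max=1615403927}, {total_changelogs=1250878}]",
]

info = summarize(sample)
print({key: as_utc(info[key]) for key in ("min", "start", "end", "max")})
print("request within available range:", request_within_range(info))
```

For the storage_a values the requested window falls inside the min/max range, so the "wrong result" error does not look like a simple out-of-range request.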

Any ideas on where to debug this? I'd prefer not to have to remove and re-sync everything as there is about 240TB on the cluster...

Thanks,
 -Matthew


________



Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users






