Hi Kotresh,

Inline. Again, thanks for your time.

--
Cyril Peponnet

> On May 27, 2015, at 10:47 PM, Kotresh Hiremath Ravishankar <khiremat@xxxxxxxxxx> wrote:
>
> Hi Cyril,
>
> Replies inline.
>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
>> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet@xxxxxxxxxxxxxxxxxx>
>> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
>> Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
>> Sent: Wednesday, May 27, 2015 9:28:00 PM
>> Subject: Re: Geo-Replication - Changelog socket is not present - Falling back to xsync
>>
>> Hi, and thanks again for those explanations.
>>
>> Due to a lot of missing and out-of-date files (with gfid mismatches some of the time), I reset the index (or I think I did) by:
>>
>> deleting the geo-rep session, resetting geo-replication.indexing (setting it to off does not work for me), and recreating it again.
>>
> Resetting the index does not initiate geo-replication from the version in which changelog was introduced. It works only for versions prior to it.
>
> NOTE 1: Recreating the geo-rep session will work only if the slave doesn't contain files with mismatched gfids. If there are any, the slave should be cleaned up before recreating.

I started it again to transfer the missing files; I'll take care of the gfid mismatches afterward. Our volume is almost 5 TB and it took almost 2 months to crawl to the slave, so I didn't want to start over :/

> NOTE 2: Another method now exists to initiate a full sync. It also expects the slave files not to be in a gfid-mismatch state (meaning the slave volume should not be written to by any means other than geo-replication). The method is to reset the stime on all the bricks of the master.
>
> Following are the steps to trigger a full sync. Let me know if there are any comments/doubts.
> ================================================
> 1. Stop geo-replication.
> 2. Remove the stime extended attribute on all the master brick roots using the following command:
>        setfattr -x trusted.glusterfs.<MASTER_VOL_UUID>.<SLAVE_VOL_UUID>.stime <brick-root>
>    NOTE: 1. If AFR is set up, do this for every replicated set.
>          2. The above-mentioned stime key can be found as follows: using 'gluster volume info <mastervol>',
>             get all brick paths, then dump all the extended attributes using 'getfattr -d -m . -e hex <brick-path>',
>             which will dump the stime key that should be removed.
>          3. This technique re-triggers a complete sync, and it involves a complete xsync crawl.
>             If there are rename issues, it might hit the rsync error on the complete re-sync as well.
>             So it is recommended, if the problematic files on the slave are known, to remove them and then
>             initiate the complete sync.

Will a complete sync send the data again whether or not it is already present on the slave? And how can I track down rename issues? The master is a living volume with a lot of creations / renames / deletions.

> 3. Start geo-replication.
>
> The above technique can also be used to trigger a data sync on only one particular brick: just removing the stime extended attribute on the root of the master brick to be synced will do. If AFR is set up, remove the stime on all bricks of that replicated set.
>
> ================================
>
>> So for now it's still in the hybrid crawl process.
>>
>> I ended up with that because some entire folders were not synced up by the first hybrid crawl (and touch does nothing afterward in changelog). In fact touching any file doesn't trigger any resync; only delete/rename/change do.
>>
> In newer geo-replication, from the version in which history crawl was introduced, the xsync crawl is minimized. Once it reaches the timestamp where it gets the historical changelogs, it starts using history changelogs. Touch will be recorded as a SETATTR in the changelog, so geo-rep will not sync the data. That is why the new virtual setattr interface mentioned in the previous mail was introduced.
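For reference, a rough sketch of driving that interface over an entire missing subtree, from a client mount of the master (untested; /mnt/master/path/to/missing is a made-up example path, and directories are pushed before the files inside them, as required below):

    # directories first: find prints a parent before its children
    find /mnt/master/path/to/missing -type d -exec setfattr -n glusterfs.geo-rep.trigger-sync -v "1" {} \;
    # then the files inside them
    find /mnt/master/path/to/missing -type f -exec setfattr -n glusterfs.geo-rep.trigger-sync -v "1" {} \;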
>> 1/
>>> 1. Directories:
>>>    #setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <DIR>
>>> 2. Files:
>>>    #setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <file-path>
>>
>> Is it recursive? (for directories) Or do I have to do that on each mismatching file? Should I do that on the master or the slave?
>>
> No, it is not recursive; it should be done for every missing file and directory, and directories should be done before the files inside them.
> It should be done on the master.

I don't understand the difference between setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <DIR> (vol level) and setfattr -x trusted.glusterfs.<MASTER_VOL_UUID>.<SLAVE_VOL_UUID>.stime <brick-root> (brick level).

>> 2/ For the RO, I can set the option nfs.volume-access to read-only; this will put the volume in RO for both NFS and glusterfs mounts. Correct?
>>
> Yes, that should do.

Cool! Thanks!

>> Thank you so much for your help.
>> --
>> Cyril Peponnet
>>
>>> On May 26, 2015, at 11:29 PM, Kotresh Hiremath Ravishankar <khiremat@xxxxxxxxxx> wrote:
>>>
>>> Hi Cyril,
>>>
>>> Need some clarifications. Comments inline.
>>>
>>> Thanks and Regards,
>>> Kotresh H R
>>>
>>> ----- Original Message -----
>>>> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet@xxxxxxxxxxxxxxxxxx>
>>>> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
>>>> Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
>>>> Sent: Tuesday, May 26, 2015 11:43:44 PM
>>>> Subject: Re: Geo-Replication - Changelog socket is not present - Falling back to xsync
>>>>
>>>> So, changelog is still active but I noticed that some files were missing.
>>>>
>>>> So I'm running an rsync -avn between the two volumes (master and slave) to sync them again by touching the missing files (hoping geo-rep will do the rest).
>>>>
>>> Are you running rsync -avn for the missed files between the master and slave volumes?
>>> If yes, that is dangerous and it should not be done. Geo-replication demands that the gfids of files on the master and slave be intact (meaning the gfid of 'file1' in the master volume should be the same as that of 'file1' on the slave). It is required because the data sync happens using the 'gfid', not the 'pathname' of the file. So if a manual rsync is used to sync files between master and slave by pathname, gfids will change, and further syncing of those files through geo-rep fails.
>>>
>>> A virtual setxattr interface is provided to sync missing files through geo-replication. It makes sure gfids are intact.
>>>
>>> NOTE: Directories have to be synced to the slave before trying the setxattr for files inside them.
>>>
>>> 1. Directories:
>>>    #setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <DIR>
>>> 2. Files:
>>>    #setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <file-path>
>>>
>>>> One question: can I make the slave volume RO? Because if somebody changes a file on the slave it's no longer synced (changes and deletes, though renames keep getting synced between master and slave).
>>>>
>>>> Will it have an impact on the geo-replication process if I make the slave volume RO?
>>>
>>> Again, if the slave volume is modified by anything other than geo-rep, we might end up with mismatched gfids. So exposing the slave volume to consumers as RO is always a good idea. It doesn't affect geo-rep, as it internally mounts in RW.
>>>
>>> Hope this helps. Let us know if anything else. We are happy to help you.
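As an aside on tracking down those gfid mismatches: the gfid of a file is stored in the trusted.gfid extended attribute on the bricks, so comparing that attribute for a suspect file on a master brick and on a slave brick should reveal whether it is one of the mismatched ones. A rough sketch (the slave brick path is only an example):

    # on a master brick server (brick root taken from the logs further down)
    getfattr -n trusted.gfid -e hex /export/raid/vol/path/to/suspect-file
    # on a slave brick server (example brick path)
    getfattr -n trusted.gfid -e hex /export/raid/slavevol/path/to/suspect-file
    # if the two hex values differ, that file is a gfid-mismatch case and should be
    # removed from the slave before re-triggering the sync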
>>>> Thanks again.
>>>>
>>>> --
>>>> Cyril Peponnet
>>>>
>>>> On May 25, 2015, at 12:43 AM, Kotresh Hiremath Ravishankar <khiremat@xxxxxxxxxx> wrote:
>>>>
>>>> Hi Cyril,
>>>>
>>>> Answers inline.
>>>>
>>>> Thanks and Regards,
>>>> Kotresh H R
>>>>
>>>> ----- Original Message -----
>>>> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet@xxxxxxxxxxxxxxxxxx>
>>>> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
>>>> Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
>>>> Sent: Friday, May 22, 2015 9:34:47 PM
>>>> Subject: Re: Geo-Replication - Changelog socket is not present - Falling back to xsync
>>>>
>>>> One last question, correct me if I'm wrong.
>>>>
>>>> When you start a geo-rep process it starts with xsync, aka hybrid crawling (sending files every 60s, with the file window set to 8192 files per batch).
>>>>
>>>> When the crawl is done it should use the changelog detector and dynamically propagate changes to the slaves.
>>>>
>>>> 1/ During the hybrid crawl, if we delete files from the master (and they were already transferred to the slave), the xsync process will not delete them from the slave (and we can't change that, as the option is hardcoded).
>>>> When it switches to changelog, will it remove the folders and files on the slave that are no longer on the master?
>>>>
>>>> You are right, xsync does not sync deletes of files it has already synced.
>>>> After xsync, when it switches to changelog, it doesn't delete all the entries on the slave that are no longer on the master. Changelog is capable of deleting files only from the time it switched over to changelog.
>>>>
>>>> 2/ With changelog, if I add a 10GB file and then a 1KB file, will the changelog process queue them (waiting for the 10GB file to be sent) or are the transfers done in threads?
>>>> (e.g. I add a 10GB file and delete it after 1 min, what will happen?)
>>>>
>>>> The changelog records the operations that happened on the master and is replayed by geo-replication onto the slave volume. Geo-replication syncs files in two phases.
>>>>
>>>> 1. Phase-1: Create entries through RPC (0-byte files on the slave, keeping the gfid intact as on the master).
>>>> 2. Phase-2: Sync data, through rsync/tar_over_ssh (multi-threaded).
>>>>
>>>> Ok, now keeping that in mind, Phase-1 happens serially, and Phase-2 happens in parallel. The zero-byte files for the 10GB and 1KB files get created on the slave serially, and the data for them syncs in parallel. Another thing to remember: geo-rep makes sure that syncing data to a file is tried only after the zero-byte file for it has already been created.
>>>>
>>>> In the latest release, 3.7, the xsync crawl is minimized by the history crawl feature introduced in 3.6.
>>>> So the chances of missing deletes/renames are lower.
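As a side note, one way to see which crawl mode a worker has actually picked is to grep the gsyncd log on each master node for the mode messages quoted further down in this thread. A rough sketch, assuming the default log location:

    # gsyncd master logs live under /var/log/glusterfs/geo-replication/<mastervol>/
    grep "change detection mode" /var/log/glusterfs/geo-replication/<mastervol>/*.log
    # "setting up changelog change detection mode" with no later
    # "falling back to xsync mode" means the changelog detector is in use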
>>>> Thanks.
>>>>
>>>> --
>>>> Cyril Peponnet
>>>>
>>>> On May 21, 2015, at 10:22 PM, Kotresh Hiremath Ravishankar <khiremat@xxxxxxxxxx> wrote:
>>>>
>>>> Great, hope that should work. Let's see.
>>>>
>>>> Thanks and Regards,
>>>> Kotresh H R
>>>>
>>>> ----- Original Message -----
>>>> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet@xxxxxxxxxxxxxxxxxx>
>>>> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
>>>> Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
>>>> Sent: Friday, May 22, 2015 5:31:13 AM
>>>> Subject: Re: Geo-Replication - Changelog socket is not present - Falling back to xsync
>>>>
>>>> Thanks to JoeJulian / Kaushal I managed to re-enable the changelog option, and the socket is now present.
>>>>
>>>> For the record, I had some clients running the RHS gluster-fuse client while our nodes are running the glusterfs release, and the op-versions are not "compatible".
>>>>
>>>> Now I have to wait for the initial crawl to see if it switches to changelog detector mode.
>>>>
>>>> Thanks Kotresh
>>>> --
>>>> Cyril Peponnet
>>>>
>>>> On May 21, 2015, at 8:39 AM, Cyril Peponnet <cyril.peponnet@xxxxxxxxxxxxxxxxxx> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Unfortunately,
>>>>
>>>> # gluster vol set usr_global changelog.changelog off
>>>> volume set: failed: Staging failed on mvdcgluster01.us.alcatel-lucent.com.
>>>> Error: One or more connected clients cannot support the feature being set.
>>>> These clients need to be upgraded or disconnected before running this command again
>>>>
>>>> I don't really know why; I have some clients using 3.6 as the fuse client, others are running 3.5.2.
>>>>
>>>> Any advice?
>>>>
>>>> --
>>>> Cyril Peponnet
>>>>
>>>> On May 20, 2015, at 5:17 AM, Kotresh Hiremath Ravishankar <khiremat@xxxxxxxxxx> wrote:
>>>>
>>>> Hi Cyril,
>>>>
>>>> From the brick logs, it seems the changelog-notifier thread got killed for some reason, as notify is failing with EPIPE.
>>>>
>>>> Try the following. It should probably help:
>>>> 1. Stop geo-replication.
>>>> 2. Disable changelog: gluster vol set <master-vol-name> changelog.changelog off
>>>> 3. Enable changelog: gluster vol set <master-vol-name> changelog.changelog on
>>>> 4. Start geo-replication.
>>>>
>>>> Let me know if it works.
>>>>
>>>> Thanks and Regards,
>>>> Kotresh H R
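As an aside, after the changelog off/on cycle above, a couple of quick checks should confirm that the bricks picked the changelog back up. A rough sketch, reusing the <master-vol-name> placeholder from the steps above and the socket path pattern from the logs further down:

    # the option should be listed again under "Options Reconfigured"
    gluster volume info <master-vol-name> | grep changelog
    # and each master brick node should expose its changelog socket again,
    # matching the path shown in the changes.log output below
    ls -l /var/run/gluster/changelog-*.sock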
>>>> ----- Original Message -----
>>>> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet@xxxxxxxxxxxxxxxxxx>
>>>> To: "gluster-users" <gluster-users@xxxxxxxxxxx>
>>>> Sent: Tuesday, May 19, 2015 3:16:22 AM
>>>> Subject: Geo-Replication - Changelog socket is not present - Falling back to xsync
>>>>
>>>> Hi Gluster Community,
>>>>
>>>> I have a 3-node setup at location A and a 2-node setup at location B.
>>>>
>>>> All running 3.5.2 under CentOS 7.
>>>>
>>>> I have one volume I sync through the geo-replication process.
>>>>
>>>> So far so good, the first step of geo-replication is done (hybrid-crawl).
>>>>
>>>> Now I'd like to use the changelog detector in order to delete files on the slave when they are gone on the master.
>>>>
>>>> But it always falls back to the xsync mechanism (even when I force it using config changelog_detector changelog):
>>>>
>>>> [2015-05-18 12:29:49.543922] I [monitor(monitor):129:monitor] Monitor: ------------------------------------------------------------
>>>> [2015-05-18 12:29:49.544018] I [monitor(monitor):130:monitor] Monitor: starting gsyncd worker
>>>> [2015-05-18 12:29:49.614002] I [gsyncd(/export/raid/vol):532:main_i] <top>: syncing: gluster://localhost:vol -> ssh://root@x.x.x.x:gluster://localhost:vol
>>>> [2015-05-18 12:29:54.696532] I [master(/export/raid/vol):58:gmaster_builder] <top>: setting up xsync change detection mode
>>>> [2015-05-18 12:29:54.696888] I [master(/export/raid/vol):357:__init__] _GMaster: using 'rsync' as the sync engine
>>>> [2015-05-18 12:29:54.697930] I [master(/export/raid/vol):58:gmaster_builder] <top>: setting up changelog change detection mode
>>>> [2015-05-18 12:29:54.698160] I [master(/export/raid/vol):357:__init__] _GMaster: using 'rsync' as the sync engine
>>>> [2015-05-18 12:29:54.699239] I [master(/export/raid/vol):1104:register] _GMaster: xsync temp directory: /var/run/gluster/vol/ssh%3A%2F%2Froot%40x.x.x.x%3Agluster%3A%2F%2F127.0.0.1%3Avol/ce749a38ba30d4171cd674ec00ab24f9/xsync
>>>> [2015-05-18 12:30:04.707216] I [master(/export/raid/vol):682:fallback_xsync] _GMaster: falling back to xsync mode
>>>> [2015-05-18 12:30:04.742422] I [syncdutils(/export/raid/vol):192:finalize] <top>: exiting.
>>>> [2015-05-18 12:30:05.708123] I [monitor(monitor):157:monitor] Monitor: worker(/export/raid/vol) died in startup phase
>>>> [2015-05-18 12:30:05.708369] I [monitor(monitor):81:set_state] Monitor: new state: faulty
>>>> [201
>>>>
>>>> After some Python debugging and stack trace printing I figured out the following (from /var/run/gluster/vol/ssh%3A%2F%2Froot%40x.x.x.x%3Agluster%3A%2F%2F127.0.0.1%3Avol/ce749a38ba30d4171cd674ec00ab24f9/changes.log):
>>>>
>>>> [2015-05-18 19:41:24.511423] I [gf-changelog.c:179:gf_changelog_notification_init] 0-glusterfs: connecting to changelog socket: /var/run/gluster/changelog-ce749a38ba30d4171cd674ec00ab24f9.sock (brick: /export/raid/vol)
>>>> [2015-05-18 19:41:24.511445] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 1/5...
>>>> [2015-05-18 19:41:26.511556] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 2/5...
>>>> [2015-05-18 19:41:28.511670] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 3/5...
>>>> [2015-05-18 19:41:30.511790] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 4/5...
>>>> [2015-05-18 19:41:32.511890] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 5/5...
>>>> [2015-05-18 19:41:34.512016] E [gf-changelog.c:204:gf_changelog_notification_init] 0-glusterfs: could not connect to changelog socket! bailing out...
>>>>
>>>> /var/run/gluster/changelog-ce749a38ba30d4171cd674ec00ab24f9.sock doesn't exist.
>>>> So
>>>> https://github.com/gluster/glusterfs/blob/release-3.5/xlators/features/changelog/lib/src/gf-changelog.c#L431
>>>> is failing because
>>>> https://github.com/gluster/glusterfs/blob/release-3.5/xlators/features/changelog/lib/src/gf-changelog.c#L153
>>>> cannot open the socket file.
>>>>
>>>> And I don't find any error related to changelog in the log files, except in the brick logs of node 2 (site A):
>>>>
>>>> bricks/export-raid-vol.log-20150517:[2015-05-14 17:06:52.636908] E [changelog-helpers.c:168:changelog_rollover_changelog] 0-vol-changelog: Failed to send file name to notify thread (reason: Broken pipe)
>>>> bricks/export-raid-vol.log-20150517:[2015-05-14 17:06:52.636949] E [changelog-helpers.c:280:changelog_handle_change] 0-vol-changelog: Problem rolling over changelog(s)
>>>>
>>>> gluster vol status is all fine, and the changelog options are enabled in the vol file:
>>>>
>>>> volume vol-changelog
>>>>     type features/changelog
>>>>     option changelog on
>>>>     option changelog-dir /export/raid/vol/.glusterfs/changelogs
>>>>     option changelog-brick /export/raid/vol
>>>>     subvolumes vol-posix
>>>> end-volume
>>>>
>>>> Any help will be appreciated :)
>>>>
>>>> Oh, BTW, it's hard to stop / restart the volume as I have around 4k clients connected.
>>>>
>>>> Thanks!
>>>>
>>>> --
>>>> Cyril Peponnet
>>>>
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users@xxxxxxxxxxx
>>>> http://www.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>>
>>
>>
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users