Hi Cyril,

That's great that you got it working!!! Sorry, I thought you were on the latest 3.7 :)

Thanks and Regards,
Kotresh H R

----- Original Message -----
> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet@xxxxxxxxxxxxxxxxxx>
> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
> Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
> Sent: Friday, May 29, 2015 3:11:33 AM
> Subject: Re: Geo-Replication - Changelog socket is not present - Falling back to xsync
>
> So, to sum up, I finally found a workaround:
>
> Get the diff between master and slave for data:
> rsync -avn --delete src dst > liste.txt
>
> From there I deleted the removed files from the slave. That was easy.
>
> Now, to fix the mismatched gfids…
>
> I removed the unsynced files from the slave.
>
> On the master I did a cp myfile myfile__
>
> and waited for changelog to process those new __ files.
>
> Finally, a mv myfile__ myfile on the master and another wait for changelog to
> process the changes.
>
> Fortunately only 4k files were impacted, so it was pretty quick (a few hours).
>
> --
> Cyril Peponnet
>
> On May 28, 2015, at 1:34 PM, Cyril Peponnet
> <cyril.peponnet@xxxxxxxxxxxxxxxxxx> wrote:
>
> Oh, and by the way, I'm using 3.5.2 so I don't have the
> http://review.gluster.org/#/c/9370/ feature you added…
> --
> Cyril Peponnet
>
> On May 28, 2015, at 8:54 AM, Cyril Peponnet
> <cyril.peponnet@xxxxxxxxxxxxxxxxxx> wrote:
>
> Hi Kotresh,
>
> Inline.
>
> Again, thanks for your time.
>
> --
> Cyril Peponnet
>
> On May 27, 2015, at 10:47 PM, Kotresh Hiremath Ravishankar
> <khiremat@xxxxxxxxxx> wrote:
>
> Hi Cyril,
>
> Replies inline.
>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet@xxxxxxxxxxxxxxxxxx>
> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
> Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
> Sent: Wednesday, May 27, 2015 9:28:00 PM
> Subject: Re: Geo-Replication - Changelog socket is not present - Falling back to xsync
>
> Hi, and thanks again for the explanations.
>
> Because a lot of files were missing or out of date (sometimes with gfid
> mismatches), I reset the index (or so I think) by:
>
> deleting the geo-rep session, resetting geo-replication.indexing (setting it
> to off does not work for me), and recreating the session.
>
> Resetting the index does not re-initiate geo-replication from the version in
> which changelog was introduced onwards; it works only for versions prior to it.
>
> NOTE 1: Recreating the geo-rep session will work only if the slave doesn't
>         contain files with mismatched gfids. If it does, the slave should be
>         cleaned up before recreating.
>
> I started it again to transfer the missing files; I'll take care of the gfid
> mismatches afterward. Our volume is almost 5TB and it took almost 2 months to
> crawl to the slave, so I didn't want to start over :/
>
> NOTE 2: Another method now exists to initiate a full sync. It also expects the
>         slave files not to be in a gfid-mismatch state (meaning the slave
>         volume should not be written to by any means other than
>         geo-replication). The method is to reset the stime on all the bricks
>         of the master.
>
> Following are the steps to trigger a full sync. Let me know if there are any
> comments/doubts.
> ================================================
> 1. Stop geo-replication.
> 2. Remove the stime extended attribute from every master brick root using the
>    following command:
>        setfattr -x trusted.glusterfs.<MASTER_VOL_UUID>.<SLAVE_VOL_UUID>.stime <brick-root>
>
>    NOTE: 1. If AFR is set up, do this for every replica set.
>
>          2. The stime key mentioned above can be found as follows: using
>             'gluster volume info <mastervol>', get all the brick paths, then
>             dump all the extended attributes with
>             'getfattr -d -m . -e hex <brick-path>', which will show the stime
>             key that should be removed.
>
>          3. This technique re-triggers a complete sync. It involves a full
>             xsync crawl. If there are rename issues, it might hit the rsync
>             error on the complete re-sync as well. So if the problematic
>             files on the slave are known, it is recommended to remove them
>             and then initiate the complete sync.
>
> Will a complete sync send the data again even if it is already present on the
> slave? And how do I track down rename issues? The master is a living volume
> with lots of creations / renames / deletions.
>
> 3. Start geo-replication.
>
>    The above technique can also be used to trigger a data sync on only one
>    particular brick. Just removing the stime extended attribute on the root of
>    the master brick to be synced will do. If AFR is set up, remove the stime
>    on all bricks of the replica set.
>
> ================================
>
> So for now it's still in the hybrid crawl process.
>
> I ended up with that because some entire folders were not synced by the first
> hybrid crawl (and touching them afterward does nothing in changelog mode). In
> fact, touching a file doesn't trigger any resync; only delete/rename/change
> do.
>
> In newer geo-replication, from the version in which history crawl was
> introduced, the xsync crawl is minimized. Once it reaches a timestamp for
> which it has historical changelogs, it starts using history changelogs. A
> touch is recorded as SETATTR in the changelog, so geo-rep will not sync the
> data. That is why the new virtual setattr interface mentioned in the previous
> mail was introduced.
>
> 1/
> 1. Directories:
>    #setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <DIR>
> 2. Files:
>    #setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <file-path>
>
> Is it recursive (for directories), or do I have to do that on each mismatching
> file? Should I do that on the master or the slave?
>
> No, it is not recursive; it has to be done for every missing file and
> directory, and directories should be done before the files inside them.
> It should be done on the master.
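> For example, a rough sketch with find would be something like the following
> (assuming the master volume is fuse-mounted at /mnt/master, which is only an
> example path, and <missing-dir> is the subtree that is missing on the slave;
> find lists a directory before its contents, and the separate directory pass
> keeps the "directories before files" ordering explicit):
>
>    # example only: /mnt/master is an assumed client mount of the master volume
>    find /mnt/master/<missing-dir> -type d \
>        -exec setfattr -n glusterfs.geo-rep.trigger-sync -v "1" {} \;
>    find /mnt/master/<missing-dir> -type f \
>        -exec setfattr -n glusterfs.geo-rep.trigger-sync -v "1" {} \;
>
> Running the directory pass to completion first guarantees that every parent
> has been triggered before any file inside it.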
> I don't understand the difference between
> setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <DIR> (volume level) and
> setfattr -x trusted.glusterfs.<MASTER_VOL_UUID>.<SLAVE_VOL_UUID>.stime <brick-root>
> (brick level).
>
> 2/ For the RO, I can set the option nfs.volume-access to read-only; this will
> put the volume in RO for both NFS and glusterfs mounts. Correct?
>
> Yes, that should do.
>
> Cool! Thanks!
>
> Thank you so much for your help.
> --
> Cyril Peponnet
>
> On May 26, 2015, at 11:29 PM, Kotresh Hiremath Ravishankar
> <khiremat@xxxxxxxxxx> wrote:
>
> Hi Cyril,
>
> Need some clarifications. Comments inline.
>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet@xxxxxxxxxxxxxxxxxx>
> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
> Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
> Sent: Tuesday, May 26, 2015 11:43:44 PM
> Subject: Re: Geo-Replication - Changelog socket is not present - Falling back to xsync
>
> So, changelog is still active, but I noticed that some files were missing.
>
> So I'm running an rsync -avn between the two volumes (master and slave) to
> find them, and then syncing them again by touching the missing files (hoping
> geo-rep will do the rest).
>
> Are you running rsync -avn for the missed files between the master and slave
> volumes? If yes, that is dangerous and should not be done. Geo-replication
> requires the gfids of files on master and slave to be intact (meaning the gfid
> of 'file1' in the master volume should be the same as that of 'file1' on the
> slave). That is required because the data sync happens using the 'gfid', not
> the 'pathname' of the file. So if a manual rsync is used to sync files between
> master and slave by pathname, the gfids will change, and further syncing of
> those files through geo-rep fails.
>
> A virtual setxattr interface is provided to sync missing files through
> geo-replication. It makes sure the gfids stay intact.
>
> NOTE: Directories have to be synced to the slave before trying the setxattr
>       for the files inside them.
>
> 1. Directories:
>    #setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <DIR>
> 2. Files:
>    #setfattr -n glusterfs.geo-rep.trigger-sync -v "1" <file-path>
>
> One question: can I make the slave volume RO? Because if somebody changes a
> file on the slave, it is no longer synced (changes and deletes stop, though
> renames keep syncing between master and slave).
>
> Will it have an impact on the geo-replication process if I make the slave
> volume RO?
>
> Again, if the slave volume is modified by anything other than geo-rep, we
> might end up with mismatched gfids. So exposing the slave volume to consumers
> as RO is always a good idea. It doesn't affect geo-rep, as it internally
> mounts the slave RW.
>
> Hope this helps. Let us know if there is anything else. We are happy to help.
>
> Thanks again.
>
> --
> Cyril Peponnet
>
> On May 25, 2015, at 12:43 AM, Kotresh Hiremath Ravishankar
> <khiremat@xxxxxxxxxx> wrote:
>
> Hi Cyril,
>
> Answers inline.
>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet@xxxxxxxxxxxxxxxxxx>
> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
> Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
> Sent: Friday, May 22, 2015 9:34:47 PM
> Subject: Re: Geo-Replication - Changelog socket is not present - Falling back to xsync
>
> One last question, correct me if I'm wrong.
>
> When you start a geo-rep process, it starts with xsync, aka hybrid crawling
> (sending files every 60s, with the file window set to 8192 files per batch).
>
> When the crawl is done, it should switch to the changelog detector and
> propagate changes to the slave dynamically.
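> (A quick way to see which mode a worker has actually picked: check the session
> status and grep the geo-replication logs for the "change detection mode" lines
> like the ones in my original mail below. The volume/slave names and the log
> path here are only examples and may differ per setup.)
>
>    gluster volume geo-replication vol root@x.x.x.x::vol status
>    # assuming the default master-side geo-replication log directory
>    grep "change detection mode" /var/log/glusterfs/geo-replication/vol/*.log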
> 1/ During the hybrid crawl, if we delete files from the master (and they were
> already transferred to the slave), the xsync process will not delete them from
> the slave (and we can't change that, as the option is hardcoded).
> When it switches to changelog, will it remove the folders and files on the
> slave that no longer exist on the master?
>
> You are right, xsync does not sync deletes for files it has already synced.
> After xsync, when it switches to changelog, it doesn't delete all the entries
> on the slave that are no longer on the master. Changelog is only capable of
> deleting files from the time it switched to changelog.
>
> 2/ With changelog, if I add a 10GB file and afterwards a 1KB file, will the
> changelog processing queue up (waiting for the 10GB file to be sent), or are
> the transfers done in threads?
> (e.g. if I add a 10GB file and delete it after 1 min, what will happen?)
>
> Changelog records the operations that happened on the master and is replayed
> by geo-replication onto the slave volume. Geo-replication syncs files in two
> phases.
>
> 1. Phase-1: Create entries through RPC (0-byte files on the slave, keeping the
>    gfid intact as on the master).
> 2. Phase-2: Sync data, through rsync/tar-over-ssh (multi-threaded).
>
> Now, keeping that in mind: Phase-1 happens serially, and Phase-2 happens in
> parallel. The zero-byte files for the 10GB and 1KB files get created on the
> slave serially, and the data for them syncs in parallel. Another thing to
> remember: geo-rep makes sure that syncing data to a file is attempted only
> after the zero-byte file for it has already been created.
>
> In the latest release, 3.7, the xsync crawl is minimized by the history crawl
> feature introduced in 3.6, so the chances of missing deletes/renames are
> lower.
>
> Thanks.
>
> --
> Cyril Peponnet
>
> On May 21, 2015, at 10:22 PM, Kotresh Hiremath Ravishankar
> <khiremat@xxxxxxxxxx> wrote:
>
> Great, hope that should work. Let's see.
>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet@xxxxxxxxxxxxxxxxxx>
> To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx>
> Cc: "gluster-users" <gluster-users@xxxxxxxxxxx>
> Sent: Friday, May 22, 2015 5:31:13 AM
> Subject: Re: Geo-Replication - Changelog socket is not present - Falling back to xsync
>
> Thanks to JoeJulian / Kaushal I managed to re-enable the changelog option, and
> the socket is now present.
>
> For the record, I had some clients running the RHS gluster-fuse client while
> our nodes are running the glusterfs release, and the op-versions are not
> "compatible".
>
> Now I have to wait for the initial crawl to see if it switches to changelog
> detector mode.
>
> Thanks Kotresh
> --
> Cyril Peponnet
>
> On May 21, 2015, at 8:39 AM, Cyril Peponnet
> <cyril.peponnet@xxxxxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> Unfortunately:
>
> # gluster vol set usr_global changelog.changelog off
> volume set: failed: Staging failed on mvdcgluster01.us.alcatel-lucent.com.
> Error: One or more connected clients cannot support the feature being set.
> These clients need to be upgraded or disconnected before running this command
> again.
>
> I don't really know why; I have some clients using 3.6 as the fuse client,
> others are running on 3.5.2.
>
> Any advice?
>
> --
> Cyril Peponnet
>
> On May 20, 2015, at 5:17 AM, Kotresh Hiremath Ravishankar
> <khiremat@xxxxxxxxxx> wrote:
>
> Hi Cyril,
>
> From the brick logs, it seems the changelog-notifier thread has been killed
> for some reason, as notify is failing with EPIPE.
>
> Try the following. It should probably help:
> 1. Stop geo-replication.
> 2. Disable changelog: gluster vol set <master-vol-name> changelog.changelog off
> 3. Enable changelog: gluster vol set <master-vol-name> changelog.changelog on
> 4. Start geo-replication.
>
> Let me know if it works.
>
> Thanks and Regards,
> Kotresh H R
>
> ----- Original Message -----
> From: "Cyril N PEPONNET (Cyril)" <cyril.peponnet@xxxxxxxxxxxxxxxxxx>
> To: "gluster-users" <gluster-users@xxxxxxxxxxx>
> Sent: Tuesday, May 19, 2015 3:16:22 AM
> Subject: Geo-Replication - Changelog socket is not present - Falling back to xsync
>
> Hi Gluster Community,
>
> I have a 3-node setup at location A and a 2-node setup at location B.
>
> All running 3.5.2 under CentOS 7.
>
> I have one volume that I sync through the geo-replication process.
>
> So far so good, the first step of geo-replication is done (hybrid crawl).
>
> Now I'd like to use the changelog change detector in order to delete files on
> the slave when they are gone on the master.
>
> But it always falls back to the xsync mechanism (even when I force it using
> config changelog_detector changelog):
>
> [2015-05-18 12:29:49.543922] I [monitor(monitor):129:monitor] Monitor: ------------------------------------------------------------
> [2015-05-18 12:29:49.544018] I [monitor(monitor):130:monitor] Monitor: starting gsyncd worker
> [2015-05-18 12:29:49.614002] I [gsyncd(/export/raid/vol):532:main_i] <top>: syncing: gluster://localhost:vol -> ssh://root@x.x.x.x:gluster://localhost:vol
> [2015-05-18 12:29:54.696532] I [master(/export/raid/vol):58:gmaster_builder] <top>: setting up xsync change detection mode
> [2015-05-18 12:29:54.696888] I [master(/export/raid/vol):357:__init__] _GMaster: using 'rsync' as the sync engine
> [2015-05-18 12:29:54.697930] I [master(/export/raid/vol):58:gmaster_builder] <top>: setting up changelog change detection mode
> [2015-05-18 12:29:54.698160] I [master(/export/raid/vol):357:__init__] _GMaster: using 'rsync' as the sync engine
> [2015-05-18 12:29:54.699239] I [master(/export/raid/vol):1104:register] _GMaster: xsync temp directory: /var/run/gluster/vol/ssh%3A%2F%2Froot%40x.x.x.x%3Agluster%3A%2F%2F127.0.0.1%3Avol/ce749a38ba30d4171cd674ec00ab24f9/xsync
> [2015-05-18 12:30:04.707216] I [master(/export/raid/vol):682:fallback_xsync] _GMaster: falling back to xsync mode
> [2015-05-18 12:30:04.742422] I [syncdutils(/export/raid/vol):192:finalize] <top>: exiting.
> [2015-05-18 12:30:05.708123] I [monitor(monitor):157:monitor] Monitor: worker(/export/raid/vol) died in startup phase
> [2015-05-18 12:30:05.708369] I [monitor(monitor):81:set_state] Monitor: new state: faulty
> [201
>
> After some python debugging and stack trace printing I figured out that:
>
> /var/run/gluster/vol/ssh%3A%2F%2Froot%40x.x.x.x%3Agluster%3A%2F%2F127.0.0.1%3Avol/ce749a38ba30d4171cd674ec00ab24f9/changes.log
>
> [2015-05-18 19:41:24.511423] I [gf-changelog.c:179:gf_changelog_notification_init] 0-glusterfs: connecting to changelog socket: /var/run/gluster/changelog-ce749a38ba30d4171cd674ec00ab24f9.sock (brick: /export/raid/vol)
> [2015-05-18 19:41:24.511445] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 1/5...
> [2015-05-18 19:41:26.511556] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 2/5...
> [2015-05-18 19:41:28.511670] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 3/5...
> [2015-05-18 19:41:30.511790] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 4/5...
> [2015-05-18 19:41:32.511890] W [gf-changelog.c:189:gf_changelog_notification_init] 0-glusterfs: connection attempt 5/5...
> [2015-05-18 19:41:34.512016] E [gf-changelog.c:204:gf_changelog_notification_init] 0-glusterfs: could not connect to changelog socket! bailing out...
>
> /var/run/gluster/changelog-ce749a38ba30d4171cd674ec00ab24f9.sock doesn't
> exist. So
> https://github.com/gluster/glusterfs/blob/release-3.5/xlators/features/changelog/lib/src/gf-changelog.c#L431
> is failing because
> https://github.com/gluster/glusterfs/blob/release-3.5/xlators/features/changelog/lib/src/gf-changelog.c#L153
> cannot open the socket file.
>
> And I don't find any error related to changelog in the log files, except in
> the brick logs of node 2 (site A):
>
> bricks/export-raid-vol.log-20150517:[2015-05-14 17:06:52.636908] E [changelog-helpers.c:168:changelog_rollover_changelog] 0-vol-changelog: Failed to send file name to notify thread (reason: Broken pipe)
> bricks/export-raid-vol.log-20150517:[2015-05-14 17:06:52.636949] E [changelog-helpers.c:280:changelog_handle_change] 0-vol-changelog: Problem rolling over changelog(s)
>
> gluster vol status is all fine, and the changelog options are enabled in the
> vol file:
>
> volume vol-changelog
>     type features/changelog
>     option changelog on
>     option changelog-dir /export/raid/vol/.glusterfs/changelogs
>     option changelog-brick /export/raid/vol
>     subvolumes vol-posix
> end-volume
>
> Any help will be appreciated :)
>
> Oh, BTW, it's hard to stop / restart the volume as I have around 4k clients
> connected.
>
> Thanks!
>
> --
> Cyril Peponnet

_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users