Thanks for your response (6 months ago!) but I have only just got around to following up on this.
Unfortunately, I had already copied and shipped the data to the second datacenter before copying the GFIDs, so I stumbled before the first hurdle!
I have been using the scripts provided in extras/geo-rep for an earlier version upgrade. With a bit of tinkering, these have given me a file containing the GFID/file pairs needed to sync to the slave before enabling geo-replication.
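For anyone who finds this thread later, this is roughly the master-side invocation I pieced together to build that file. The volume name below is illustrative, and the script arguments may differ between glusterfs versions, so check against the copies shipped with your release:

    # run from the extras/geo-rep directory on a master node
    # "video-backup" is an illustrative volume name
    bash generate-gfid-file.sh localhost:video-backup $PWD/get-gfid.sh /tmp/master_gfid_file.txt

    # each line of /tmp/master_gfid_file.txt pairs a GFID with a path relative to the
    # volume root, which is what gsync-sync-gfid then replays against the slave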
Unfortunately, the gsync-sync-gfid program isn't working: it reports a failure for every file, and I see the following in the fuse log:
[2017-12-21 16:36:37.171846] D [MSGID: 0] [dht-common.c:997:dht_revalidate_cbk] 0-video-backup-dht: revalidate lookup of /path returned with op_ret 0 [Invalid argument]
[2017-12-21 16:36:37.172352] D [fuse-helpers.c:650:fuse_ignore_xattr_set] 0-glusterfs-fuse: allowing setxattr: key [glusterfs.gfid.heal], client pid [0]
[2017-12-21 16:36:37.172457] D [logging.c:1953:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk
The message "D [MSGID: 0] [dht-common.c:997:dht_revalidate_cbk] 0-video-backup-dht: revalidate lookup of /path returned with op_ret 0 [Invalid argument]" repeated 8 times between [2017-12-21 16:36:37.171846] and [2017-12-21 16:36:37.172225]
[2017-12-21 16:36:37.172457] D [MSGID: 0] [dht-common.c:2692:dht_lookup] 0-video-backup-dht: Calling fresh lookup for /path/file.txt on video-backup-client-4
[2017-12-21 16:36:37.173166] D [MSGID: 0] [client-rpc-fops.c:2910:client3_3_lookup_cbk] 0-video-backup-client-4: gfid changed for /path/file.txt
[2017-12-21 16:36:37.173212] D [MSGID: 0] [client-rpc-fops.c:2941:client3_3_lookup_cbk] 0-stack-trace: stack-address: 0x7fe39ebdc42c, video-backup-client-4 returned -1 error: Stale file handle [Stale file handle]
[2017-12-21 16:36:37.173233] D [MSGID: 0] [dht-common.c:2279:dht_lookup_cbk] 0-video-backup-dht: fresh_lookup returned for /path/file.txt with op_ret -1 [Stale file handle]
[2017-12-21 16:36:37.173250] D [MSGID: 0] [dht-common.c:2359:dht_lookup_cbk] 0-video-backup-dht: Lookup of /path/file.txt for subvolume video-backup-client-4 failed [Stale file handle]
[2017-12-21 16:36:37.173267] D [MSGID: 0] [dht-common.c:2422:dht_lookup_cbk] 0-stack-trace: stack-address: 0x7fe39ebdc42c, video-backup-dht returned -1 error: Stale file handle [Stale file handle]
[2017-12-21 16:36:37.173285] D [MSGID: 0] [io-cache.c:256:ioc_lookup_cbk] 0-stack-trace: stack-address: 0x7fe39ebdc42c, video-backup-io-cache returned -1 error: Stale file handle [Stale file handle]
[2017-12-21 16:36:37.173303] D [MSGID: 0] [quick-read.c:447:qr_lookup_cbk] 0-stack-trace: stack-address: 0x7fe39ebdc42c, video-backup-quick-read returned -1 error: Stale file handle [Stale file handle]
[2017-12-21 16:36:37.173320] D [MSGID: 0] [md-cache.c:863:mdc_lookup_cbk] 0-stack-trace: stack-address: 0x7fe39ebdc42c, video-backup-md-cache returned -1 error: Stale file handle [Stale file handle]
[2017-12-21 16:36:37.173336] D [MSGID: 0] [io-stats.c:2116:io_stats_lookup_cbk] 0-stack-trace: stack-address: 0x7fe39ebdc42c, video-backup returned -1 error: Stale file handle [Stale file handle]
[2017-12-21 16:36:37.173374] D [MSGID: 0] [gfid-access.c:390:ga_heal_cbk] 0-stack-trace: stack-address: 0x7fe39ebd7498, gfid-access-autoload returned -1 error: Stale file handle [Stale file handle]
[2017-12-21 16:36:37.173405] W [fuse-bridge.c:1291:fuse_err_cbk] 0-glusterfs-fuse: 57862: SETXATTR() /path => -1 (Stale file handle)
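The "gfid changed for /path/file.txt" line makes me think the copied files already carry a different GFID on the slave bricks than the one the heal is trying to set. A quick way to compare the two would be something like the following (the brick path, mount point and file are placeholders):

    # on a slave node, read the GFID the copied file currently has on the brick backend
    getfattr -n trusted.gfid -e hex /data/brick1/video-backup/path/file.txt

    # on a master client, read the GFID the master assigned, via the virtual xattr
    getfattr -n glusterfs.gfid.string /mnt/video-backup/path/file.txt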
I notice that in slave-upgrade.sh the .glusterfs contents on each brick are deleted and the volume is restarted before gsync-sync-gfid is run.
I have a good working backup at the moment and deleting the .glusterfs folder worries me. Is this the solution, or is something else wrong?
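For context, my reading of slave-upgrade.sh is roughly the sequence below (paraphrased, with placeholder brick and volume names; consult the actual script rather than running anything from this sketch):

    # paraphrase of the slave-upgrade.sh steps as I understand them -- not a drop-in script
    gluster volume stop video-backup-slave

    # on every slave brick: clear the old GFID-to-inode index under .glusterfs
    rm -rf /data/brick1/video-backup-slave/.glusterfs/*

    gluster volume start video-backup-slave

    # then gsync-sync-gfid replays the master's GFIDs from the GFID/file list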
On 27 June 2017 at 06:31, Aravinda <avishwan@xxxxxxxxxx> wrote:
Answers inline,
@Kotresh, please add if I missed anything.
regards
Aravinda VK
http://aravindavk.in

On 06/23/2017 06:29 PM, Stephen Remde wrote:
> I have a ~600tb distributed gluster volume that I want to start using geo replication on.
> The current volume is on 6 100tb bricks on 2 servers.
> My plan is:
> 1) copy each of the bricks to new arrays on the servers locally

Before you start copying:
- Enable changelog on the Master volume, using `gluster volume set <volname> changelog.changelog on`
- Record the timestamp. This is required to set the marking once the Geo-rep session is created.
- Gluster Geo-replication replicates data with the same GFID on the target, so if we are copying data manually we should ensure that the GFIDs remain the same and are properly linked as hardlinks in the .glusterfs directory. [A small check of this layout is sketched after this quoted message.]

> 2) move the new arrays to the new servers
> 3) create the volume on the new servers using the arrays
> 4) fix the layout on the new volume

After creating the Geo-replication session it will try to start from the beginning. To avoid that, set the marking on each brick before starting Geo-replication, saying that data up to this time is already synced to the Slave. Once started, it will resume from that timestamp. (I can provide a script for the same.)

> 5) start georeplication (which should be relatively small as most of the data should be already there?)
> Is this likely to succeed? Any advice welcomed.
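Since the hardlink requirement above is the part that is easiest to get wrong when bricks are copied by hand, here is a small check of the layout as I understand it, for regular files only (the brick path and file are placeholders): a file whose trusted.gfid is, say, ab12cd34-... should have a hardlink at .glusterfs/ab/12/ab12cd34-... on the same brick.

    # run as root directly against a brick, not the FUSE mount (placeholder paths)
    BRICK=/data/brick1/video-backup
    FILE="$BRICK/path/file.txt"

    # read the GFID stored in the trusted.gfid xattr (hex, without the leading 0x)
    GFID_HEX=$(getfattr -n trusted.gfid -e hex "$FILE" 2>/dev/null \
               | awk -F= '/trusted.gfid/ {print substr($2,3)}')

    # turn the 32 hex chars into the canonical UUID form used under .glusterfs
    UUID=$(echo "$GFID_HEX" | sed -E 's/(.{8})(.{4})(.{4})(.{4})(.{12})/\1-\2-\3-\4-\5/')
    LINK="$BRICK/.glusterfs/${UUID:0:2}/${UUID:2:2}/$UUID"

    # for a regular file, both paths should show the same inode number (hardlink)
    ls -li "$FILE" "$LINK"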
--
Dr Stephen Remde
Director, Innovation and Research
T: 01535 280066
M: 07764 740920
E: stephen.remde@xxxxxxxxxxx
W: www.gaist.co.uk
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users