Hi, I hit problem while testing geo-replication. Anybody knows how to fix it except deleting and recreating geo-replication? Geo-replication failed to delete from slave file partially written to master volume. Have geo-replication between two nodes that are running glusterfs 3.7.16 with master volume: [root@SC-182 log]# gluster volume info master-for-183-0003 Volume Name: master-for-183-0003 Type: Distribute Volume ID: 84501a83-b07c-4768-bfaa-418b038e1a9e Status: Started Number of Bricks: 1 Transport-type: tcp Bricks: Brick1: 10.10.60.182:/exports/nas-segment-0012/master-for-183-0003 Options Reconfigured: changelog.changelog: on geo-replication.ignore-pid-check: on geo-replication.indexing: on server.allow-insecure: on performance.quick-read: off performance.stat-prefetch: off nfs.disable: on nfs.addr-namelookup: off performance.readdir-ahead: on cluster.enable-shared-storage: enable snap-activate-on-create: enable and slave volume: [root@SC-183 log]# gluster volume info rem-volume-0001 Volume Name: rem-volume-0001 Type: Distribute Volume ID: 7680de7a-d0e2-42f2-96a9-4da29adba73c Status: Started Number of Bricks: 1 Transport-type: tcp Bricks: Brick1: 10.10.60.183:/exports/nas183-segment-0001/rem-volume-0001 Options Reconfigured: performance.readdir-ahead: on nfs.addr-namelookup: off nfs.disable: on performance.stat-prefetch: off performance.quick-read: off server.allow-insecure: on snap-activate-on-create: enable Master volume mounted on node: [root@SC-182 log]# mount 127.0.0.1:/master-for-183-0003 on /samba/master-for-183-0003 type fuse.glusterfs (rw,allow_other,max_read=131072) Let's fill up space on master volume: [root@SC-182 log]# mkdir /samba/master-for-183-0003/cifs_share/dir3 [root@SC-182 log]# cp big.file /samba/master-for-183-0003/cifs_share/dir3/ [root@SC-182 log]# cp big.file /samba/master-for-183-0003/cifs_share/dir3/big.file.1 cp: writing `/samba/master-for-183-0003/cifs_share/dir3/big.file.1': No space left on device cp: closing 
`/samba/master-for-183-0003/cifs_share/dir3/big.file.1': No space left on device File " big.file.1" represent part of the original file: [root@SC-182 log]# ls -l /samba/master-for-183-0003/cifs_share/dir3/* -rwx------ 1 root root 78930370 Dec 5 16:49 /samba/master-for-183-0003/cifs_share/dir3/big.file -rwx------ 1 root root 22155264 Dec 5 16:49 /samba/master-for-183-0003/cifs_share/dir3/big.file.1 Both new files are geo-replicated to the Slave volume successfully: [root@SC-183 log]# ls -l /exports/nas183-segment-0001/rem-volume-0001/cifs_share/dir3/ total 98720 -rwx------ 2 root root 78930370 Dec 5 16:49 big.file -rwx------ 2 root root 22155264 Dec 5 16:49 big.file.1 [root@SC-182 log]# /usr/sbin/gluster volume geo-replication master-for-183-0003 nasgorep@10.10.60.183::rem-volume-0001 status detail MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED ENTRY DATA META FAILURES CHECKPOINT TIME CHECKPOINT COMPLETED CHECKPOINT COMPLETION TIME ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ------ 10.10.60.182 master-for-183-0003 /exports/nas-segment-0012/master-for-183-0003 nasgorep nasgorep@10.10.60.183::rem-volume-0001 10.10.60.183 Active Changelog Crawl 2016-12-05 16:49:48 0 0 0 0 N/A N/A N/A Let's delete partially written file from the master mount: [root@SC-182 log]# rm /samba/master-for-183-0003/cifs_share/dir3/big.file.1 rm: remove regular file `/samba/master-for-183-0003/cifs_share/dir3/big.file.1'? 
y [root@SC-182 log]# ls -l /samba/master-for-183-0003/cifs_share/dir3/* -rwx------ 1 root root 78930370 Dec 5 16:49 /samba/master-for-183-0003/cifs_share/dir3/big.file Set checkpoint: 32643 12/05/2016 16:57:46.540390536 1480985866 command: /usr/sbin/gluster volume geo-replication master-for-183-0003 nasgorep@10.10.60.183::rem-volume-0001 config checkpoint now 2>&1 32643 12/05/2016 16:57:48.770820909 1480985868 status=0 /usr/sbin/gluster volume geo-replication master-for-183-0003 nasgorep@10.10.60.183::rem-volume-0001 config checkpoint now 2>&1 Check geo-replication status: [root@SC-182 log]# /usr/sbin/gluster volume geo-replication master-for-183-0003 nasgorep@10.10.60.183::rem-volume-0001 status detail MASTER NODE MASTER VOL MASTER BRICK SLAVE USER SLAVE SLAVE NODE STATUS CRAWL STATUS LAST_SYNCED ENTRY DATA META FAILURES CHECKPOINT TIME CHECKPOINT COMPLETED CHECKPOINT COMPLETION TIME ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ---------------------------------------------------------------------------- ---------- 10.10.60.182 master-for-183-0003 /exports/nas-segment-0012/master-for-183-0003 nasgorep nasgorep@10.10.60.183::rem-volume-0001 10.10.60.183 Active Changelog Crawl 2016-12-05 16:57:48 0 0 0 0 2016-12-05 16:57:46 Yes 2016-12-05 16:57:50 But the partially written file "big.file.1" is still present on the slave volume: [root@SC-183 log]# ls -l /exports/nas183-segment-0001/rem-volume-0001/cifs_share/dir3/ total 98720 -rwx------ 2 root root 78930370 Dec 5 16:49 big.file -rwx------ 2 root root 22155264 Dec 5 16:49 big.file.1 Gluster logs for geo-replication do not have any indication about failure to delete the file: [root@SC-182 log]# view /var/log/glusterfs/geo-replication/master-for-183-0003/ssh%3A%2F%2Fnasgorep% 
4010.10.60.183%3Agluster%3A%2F%2F127.0.0.1%3Arem-volume-0001.log [2016-12-06 00:49:40.267956] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 17 crawls, 1 turns [2016-12-06 00:49:52.348413] I [master(/exports/nas-segment-0012/master-for-183-0003):1121:crawl] _GMaster: slave's time: (1480985358, 0) [2016-12-06 00:49:53.296811] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389 [2016-12-06 00:49:53.901186] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389 [2016-12-06 00:49:54.760957] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389 [2016-12-06 00:49:55.384705] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389 [2016-12-06 00:49:55.987873] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389 [2016-12-06 00:49:56.848361] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389 [2016-12-06 00:49:57.471925] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389 [2016-12-06 00:49:58.76416] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389 [2016-12-06 00:49:58.935801] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389 [2016-12-06 00:49:59.560571] E [resource(/exports/nas-segment-0012/master-for-183-0003):1021:rsync] SSH: SYNC Error(Rsync): rsync: rsync_xal_set: 
lsetxattr(".gfid/103b87ff-3b7a-4f2b-8bc5-a2f9c1d3fc0e","trusted.glusterfs.84 501a83-b07c-4768-bfaa-418b038e1a9e.xtime") failed: Operation not permitted (1) [2016-12-06 00:49:59.560972] E [master(/exports/nas-segment-0012/master-for-183-0003):1037:process] _GMaster: changelogs CHANGELOG.1480985389 could not be processed completely - moving on... [2016-12-06 00:50:41.839792] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 18 crawls, 1 turns [2016-12-06 00:51:42.203411] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns [2016-12-06 00:52:42.600800] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns [2016-12-06 00:53:42.983913] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns [2016-12-06 00:54:43.381218] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns [2016-12-06 00:55:43.749927] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns [2016-12-06 00:56:44.113914] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns [2016-12-06 00:57:44.494354] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns [2016-12-06 00:57:48.528424] I [gsyncd(conf):671:main_i] <top>: checkpoint 1480985866 set [2016-12-06 00:57:48.528704] I [syncdutils(conf):220:finalize] <top>: exiting. 
    [2016-12-06 00:57:50.530714] I [master(/exports/nas-segment-0012/master-for-183-0003):1121:crawl] _GMaster: slave's time: (1480985388, 0)
    [2016-12-06 00:58:44.802122] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 1 turns
    [2016-12-06 00:59:45.181669] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 20 crawls, 0 turns

Best regards,
Viktor Nosov
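
P.S. For anyone triaging a geo-replication log like the one quoted in this message, here is a minimal Python sketch that counts lines per severity and lists the changelogs that were given up on ("could not be processed completely"). It is not part of gsyncd or the gluster CLI. The regular expressions are assumptions inferred only from the log lines quoted here, and the names `LOG_LINE`, `SKIPPED`, `summarize`, and `SAMPLE` are hypothetical.

```python
import re
from collections import Counter

# Assumed line layout, inferred from the log excerpt in this message:
#   [timestamp] LEVEL [module(brick):line:function] component: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+\[(?P<where>[^\]]+)\]\s+(?P<msg>.*)$"
)
# Message text seen in the excerpt when a changelog is abandoned.
SKIPPED = re.compile(r"changelogs (\S+) could not be processed completely")


def summarize(lines):
    """Return (per-level counts, warning/error entries, skipped changelogs)."""
    counts = Counter()
    problems = []
    skipped = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # ignore anything that does not look like a log line
        level = match.group("level")
        counts[level] += 1
        if level in ("W", "E"):
            problems.append((match.group("ts"), level, match.group("msg")))
        hit = SKIPPED.search(match.group("msg"))
        if hit:
            skipped.append(hit.group(1))
    return counts, problems, skipped


# A few lines copied from the log above, used as sample input.
SAMPLE = [
    "[2016-12-06 00:49:40.267956] I [master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap] _GMaster: 17 crawls, 1 turns",
    "[2016-12-06 00:49:53.296811] W [master(/exports/nas-segment-0012/master-for-183-0003):1058:process] _GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389",
    '[2016-12-06 00:49:59.560571] E [resource(/exports/nas-segment-0012/master-for-183-0003):1021:rsync] SSH: SYNC Error(Rsync): rsync: rsync_xal_set: lsetxattr(".gfid/103b87ff-3b7a-4f2b-8bc5-a2f9c1d3fc0e","trusted.glusterfs.84501a83-b07c-4768-bfaa-418b038e1a9e.xtime") failed: Operation not permitted (1)',
    "[2016-12-06 00:49:59.560972] E [master(/exports/nas-segment-0012/master-for-183-0003):1037:process] _GMaster: changelogs CHANGELOG.1480985389 could not be processed completely - moving on...",
]

counts, problems, skipped = summarize(SAMPLE)
for ts, level, msg in problems:
    print(ts, level, msg)
print("skipped changelogs:", skipped)
```

To run it against a real log, pass an open file object instead of `SAMPLE`, since `summarize` only needs an iterable of lines.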