Geo-replication fails to delete from the slave a file that was partially written to the master volume

Hi,

I hit a problem while testing geo-replication. Does anybody know how to fix it
without deleting and recreating geo-replication?

The problem: geo-replication failed to delete from the slave volume a file
that had been only partially written to the master volume.

I have geo-replication set up between two nodes running glusterfs 3.7.16,

with master volume:

[root@SC-182 log]# gluster volume info master-for-183-0003

Volume Name: master-for-183-0003
Type: Distribute
Volume ID: 84501a83-b07c-4768-bfaa-418b038e1a9e
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.10.60.182:/exports/nas-segment-0012/master-for-183-0003
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
server.allow-insecure: on
performance.quick-read: off
performance.stat-prefetch: off
nfs.disable: on
nfs.addr-namelookup: off
performance.readdir-ahead: on
cluster.enable-shared-storage: enable
snap-activate-on-create: enable

and slave volume:

[root@SC-183 log]# gluster volume info rem-volume-0001

Volume Name: rem-volume-0001
Type: Distribute
Volume ID: 7680de7a-d0e2-42f2-96a9-4da29adba73c
Status: Started
Number of Bricks: 1
Transport-type: tcp
Bricks:
Brick1: 10.10.60.183:/exports/nas183-segment-0001/rem-volume-0001
Options Reconfigured:
performance.readdir-ahead: on
nfs.addr-namelookup: off
nfs.disable: on
performance.stat-prefetch: off
performance.quick-read: off
server.allow-insecure: on
snap-activate-on-create: enable

The master volume is mounted on the master node:

[root@SC-182 log]# mount
127.0.0.1:/master-for-183-0003 on /samba/master-for-183-0003 type fuse.glusterfs (rw,allow_other,max_read=131072)

Let's fill up the space on the master volume:

[root@SC-182 log]# mkdir /samba/master-for-183-0003/cifs_share/dir3
[root@SC-182 log]# cp big.file /samba/master-for-183-0003/cifs_share/dir3/
[root@SC-182 log]# cp big.file /samba/master-for-183-0003/cifs_share/dir3/big.file.1
cp: writing `/samba/master-for-183-0003/cifs_share/dir3/big.file.1': No space left on device
cp: closing `/samba/master-for-183-0003/cifs_share/dir3/big.file.1': No space left on device

The file "big.file.1" contains only part of the original file:

[root@SC-182 log]# ls -l /samba/master-for-183-0003/cifs_share/dir3/*
-rwx------ 1 root root 78930370 Dec  5 16:49 /samba/master-for-183-0003/cifs_share/dir3/big.file
-rwx------ 1 root root 22155264 Dec  5 16:49 /samba/master-for-183-0003/cifs_share/dir3/big.file.1

Both new files are geo-replicated to the slave volume successfully:

[root@SC-183 log]# ls -l /exports/nas183-segment-0001/rem-volume-0001/cifs_share/dir3/
total 98720
-rwx------ 2 root root 78930370 Dec  5 16:49 big.file
-rwx------ 2 root root 22155264 Dec  5 16:49 big.file.1

[root@SC-182 log]# /usr/sbin/gluster volume geo-replication master-for-183-0003 nasgorep@10.10.60.183::rem-volume-0001 status detail

MASTER NODE:                 10.10.60.182
MASTER VOL:                  master-for-183-0003
MASTER BRICK:                /exports/nas-segment-0012/master-for-183-0003
SLAVE USER:                  nasgorep
SLAVE:                       nasgorep@10.10.60.183::rem-volume-0001
SLAVE NODE:                  10.10.60.183
STATUS:                      Active
CRAWL STATUS:                Changelog Crawl
LAST_SYNCED:                 2016-12-05 16:49:48
ENTRY:                       0
DATA:                        0
META:                        0
FAILURES:                    0
CHECKPOINT TIME:             N/A
CHECKPOINT COMPLETED:        N/A
CHECKPOINT COMPLETION TIME:  N/A

Let's delete the partially written file from the master mount:

[root@SC-182 log]# rm /samba/master-for-183-0003/cifs_share/dir3/big.file.1
rm: remove regular file `/samba/master-for-183-0003/cifs_share/dir3/big.file.1'? y

[root@SC-182 log]# ls -l /samba/master-for-183-0003/cifs_share/dir3/*
-rwx------ 1 root root 78930370 Dec  5 16:49 /samba/master-for-183-0003/cifs_share/dir3/big.file

Set a checkpoint:

32643 12/05/2016 16:57:46.540390536 1480985866 command: /usr/sbin/gluster
volume geo-replication master-for-183-0003
nasgorep@10.10.60.183::rem-volume-0001 config checkpoint now 2>&1
32643 12/05/2016 16:57:48.770820909 1480985868 status=0 /usr/sbin/gluster
volume geo-replication master-for-183-0003
nasgorep@10.10.60.183::rem-volume-0001 config checkpoint now 2>&1
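
The third field in the two lines above (1480985866) looks like a Unix epoch
timestamp, and the same kind of number shows up later in the geo-replication
log (the checkpoint value, the "slave's time" tuples, the CHANGELOG.<number>
suffix). The shell output uses local time while the gsyncd log timestamps are
8 hours ahead, so a small converter helps line the two up. This is only a
sketch: the UTC-8 offset is an assumption inferred from that 8 hour gap, and
the helper name describe() is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the node's wall clock is UTC-8. This is inferred from the gap
# between the 16:57 shell times and the 00:57 log times; it is not stated
# anywhere in the output.
ASSUMED_LOCAL = timezone(timedelta(hours=-8))
FMT = "%Y-%m-%d %H:%M:%S"


def describe(epoch):
    """Return (UTC string, assumed-local string) for a Unix timestamp."""
    utc = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return utc.strftime(FMT), utc.astimezone(ASSUMED_LOCAL).strftime(FMT)


# Values copied from the command log and the geo-replication log.
for label, epoch in [
    ("checkpoint", 1480985866),
    ("slave's time", 1480985388),
    ("CHANGELOG suffix", 1480985389),
]:
    utc_text, local_text = describe(epoch)
    print(f"{label:18} {epoch}  UTC {utc_text}  local {local_text}")
```

Read this way, the changelog that could not be processed (suffix 1480985389)
is the one right after the reported slave's time of 1480985388.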

Check geo-replication status:

[root@SC-182 log]# /usr/sbin/gluster volume geo-replication master-for-183-0003 nasgorep@10.10.60.183::rem-volume-0001 status detail

MASTER NODE:                 10.10.60.182
MASTER VOL:                  master-for-183-0003
MASTER BRICK:                /exports/nas-segment-0012/master-for-183-0003
SLAVE USER:                  nasgorep
SLAVE:                       nasgorep@10.10.60.183::rem-volume-0001
SLAVE NODE:                  10.10.60.183
STATUS:                      Active
CRAWL STATUS:                Changelog Crawl
LAST_SYNCED:                 2016-12-05 16:57:48
ENTRY:                       0
DATA:                        0
META:                        0
FAILURES:                    0
CHECKPOINT TIME:             2016-12-05 16:57:46
CHECKPOINT COMPLETED:        Yes
CHECKPOINT COMPLETION TIME:  2016-12-05 16:57:50

But the partially written file "big.file.1" is still present on the slave
volume:

[root@SC-183 log]# ls -l /exports/nas183-segment-0001/rem-volume-0001/cifs_share/dir3/
total 98720
-rwx------ 2 root root 78930370 Dec  5 16:49 big.file
-rwx------ 2 root root 22155264 Dec  5 16:49 big.file.1
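
Leftovers like this can also be found without comparing two "ls" listings by
eye. The sketch below reports names that exist in a slave-side directory but
not in the matching master-side directory. It assumes both directories are
reachable from one host (for example, with the slave volume also mounted
there), which is not the case in the setup above; the function name
only_on_slave() is hypothetical and it checks a single directory level only.

```python
import os


def only_on_slave(master_dir, slave_dir):
    """Return sorted names present in slave_dir but missing from master_dir.

    Only one directory level is compared; there is no recursion.
    """
    master_names = set(os.listdir(master_dir))
    slave_names = set(os.listdir(slave_dir))
    return sorted(slave_names - master_names)
```

With the directories shown above as arguments, the expected result would be a
list containing just "big.file.1".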

The Gluster geo-replication logs give no indication of a failure to delete
the file:

[root@SC-182 log]# view /var/log/glusterfs/geo-replication/master-for-183-0003/ssh%3A%2F%2Fnasgorep%4010.10.60.183%3Agluster%3A%2F%2F127.0.0.1%3Arem-volume-0001.log

[2016-12-06 00:49:40.267956] I
[master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
_GMaster: 17 crawls, 1 turns
[2016-12-06 00:49:52.348413] I
[master(/exports/nas-segment-0012/master-for-183-0003):1121:crawl] _GMaster:
slave's time: (1480985358, 0)
[2016-12-06 00:49:53.296811] W
[master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
_GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
[2016-12-06 00:49:53.901186] W
[master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
_GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
[2016-12-06 00:49:54.760957] W
[master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
_GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
[2016-12-06 00:49:55.384705] W
[master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
_GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
[2016-12-06 00:49:55.987873] W
[master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
_GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
[2016-12-06 00:49:56.848361] W
[master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
_GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
[2016-12-06 00:49:57.471925] W
[master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
_GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
[2016-12-06 00:49:58.76416] W
[master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
_GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
[2016-12-06 00:49:58.935801] W
[master(/exports/nas-segment-0012/master-for-183-0003):1058:process]
_GMaster: incomplete sync, retrying changelogs: CHANGELOG.1480985389
[2016-12-06 00:49:59.560571] E
[resource(/exports/nas-segment-0012/master-for-183-0003):1021:rsync] SSH:
SYNC Error(Rsync): rsync: rsync_xal_set:
lsetxattr(".gfid/103b87ff-3b7a-4f2b-8bc5-a2f9c1d3fc0e","trusted.glusterfs.84501a83-b07c-4768-bfaa-418b038e1a9e.xtime")
failed: Operation not permitted (1)
[2016-12-06 00:49:59.560972] E
[master(/exports/nas-segment-0012/master-for-183-0003):1037:process]
_GMaster: changelogs CHANGELOG.1480985389 could not be processed
completely - moving on...
[2016-12-06 00:50:41.839792] I
[master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
_GMaster: 18 crawls, 1 turns
[2016-12-06 00:51:42.203411] I
[master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
_GMaster: 20 crawls, 0 turns
[2016-12-06 00:52:42.600800] I
[master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
_GMaster: 20 crawls, 0 turns
[2016-12-06 00:53:42.983913] I
[master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
_GMaster: 20 crawls, 0 turns
[2016-12-06 00:54:43.381218] I
[master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
_GMaster: 20 crawls, 0 turns
[2016-12-06 00:55:43.749927] I
[master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
_GMaster: 20 crawls, 0 turns
[2016-12-06 00:56:44.113914] I
[master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
_GMaster: 20 crawls, 0 turns
[2016-12-06 00:57:44.494354] I
[master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
_GMaster: 20 crawls, 0 turns
[2016-12-06 00:57:48.528424] I [gsyncd(conf):671:main_i] <top>: checkpoint
1480985866 set
[2016-12-06 00:57:48.528704] I [syncdutils(conf):220:finalize] <top>:
exiting.
[2016-12-06 00:57:50.530714] I
[master(/exports/nas-segment-0012/master-for-183-0003):1121:crawl] _GMaster:
slave's time: (1480985388, 0)
[2016-12-06 00:58:44.802122] I
[master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
_GMaster: 20 crawls, 1 turns
[2016-12-06 00:59:45.181669] I
[master(/exports/nas-segment-0012/master-for-183-0003):532:crawlwrap]
_GMaster: 20 crawls, 0 turns
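
Because the log above is wrapped across several lines per record, the error
records are easy to miss among the "crawls, turns" messages. The sketch below
re-joins the wrapped lines and pulls out the E-level records and the names of
changelogs reported as "could not be processed". It relies only on the line
shapes visible in the excerpt above; the function names are made up for
illustration and this is not part of gsyncd.

```python
import re

# A new record starts with a bracketed timestamp such as
# "[2016-12-06 00:49:59.560571]"; anything else is a wrapped continuation.
RECORD_START = re.compile(r"^\[\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+\]")
LEVEL = re.compile(r"^\[[^\]]+\] ([A-Z]) ")
SKIPPED = re.compile(r"changelogs (\S+) could not be processed")


def join_records(text):
    """Re-join mail-wrapped log lines into one string per log record."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # Continuations are joined with a space, so a token that was
            # split in the middle by the wrapping stays split.
            records[-1] += " " + line
    return records


def summarize(text):
    """Return (E-level records, names of changelogs that were skipped)."""
    errors, skipped = [], []
    for record in join_records(text):
        level = LEVEL.match(record)
        if level and level.group(1) == "E":
            errors.append(record)
        found = SKIPPED.search(record)
        if found:
            skipped.append(found.group(1))
    return errors, skipped
```

Feeding it the text of the log file, for example summarize(open(path).read())
with path pointing at the log shown above, would list the rsync lsetxattr
error and the skipped changelog together.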

Best regards,

Viktor Nosov