Re: Question about geo-replication and deletes in 3.5 beta train

Venky Shankar <yknev.shankar@xxxxxxxxx> · Thu, 24 Apr 2014 00:39:08 +0530

That should not happen. After a replica failover the "now" active node should continue where the "old" active node left off.

Could you provide geo-replication logs from master and slave after reproducing this (with changelog mode)?

Thanks,

-venky

On Thu, Apr 17, 2014 at 9:34 PM, CJ Beck <chris.beck@xxxxxxxxxxx> wrote:

I did set it intentionally because I found a case where files would be missed during geo-replication. Xsync seemed to handle the case better. The issue occurs when you bring down the “Active” node that is handling the geo-replication session while the change method is set to ChangeLog. Any files written into the cluster while geo-replication is down (e.g., while the geo-replication session is being failed over to another node) are missed / skipped and won’t ever be transferred to the other cluster.

Is this the expected behavior? If not, then I can open a bug on it.

-CJ

From: Venky Shankar <yknev.shankar@xxxxxxxxx>

Date: Wednesday, April 16, 2014 at 4:43 PM

To: CJ Beck <chris.beck@xxxxxxxxxxx>

Cc: "gluster-users@xxxxxxxxxxx" <gluster-users@xxxxxxxxxxx>

Subject: Re:  Question about geo-replication and deletes in 3.5 beta train

On Thu, Apr 17, 2014 at 3:01 AM, CJ Beck <chris.beck@xxxxxxxxxxx> wrote:

I did have the “change_detector” set to xsync, which seems to be the issue (bypassing the changelog method). So I can fix that and see if the deletes are propagated.

Was that set intentionally? Setting this as the main change detection mechanism would crawl the filesystem every 60 seconds to replicate the changes. Changelog mode handles live changes, so any deletes that were performed before this option was set would not be propagated.

Also, is there a way to tell the geo-replication to go ahead and walk the filesystems to do a “sync” so the remote side files are deleted, if they are not on the source?

As of now, no. With distributed geo-replication, the geo-rep daemon crawls the bricks (instead of the mount). Since a brick holds only a subset of the file system entities (e.g. in a distributed volume), it's hard to find purged entries without crawling the mount and comparing the entries between master and slave (which is slow). This is where changelog mode helps.
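The slow mount-level comparison described above can be sketched roughly as follows. This is only an illustration, not something geo-replication does itself: it assumes both the master and the slave volume are mounted on one machine, and the helper names `list_entries` and `find_purged` are hypothetical.

```python
import os


def list_entries(root):
    """Return the relative paths of every file and directory under root."""
    entries = set()
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full_path = os.path.join(dirpath, name)
            entries.add(os.path.relpath(full_path, root))
    return entries


def find_purged(master_mount, slave_mount):
    """Return entries that exist on the slave mount but not on the master mount.

    Both arguments are assumed to be client mount points of the volumes,
    not brick directories, so each walk sees the whole namespace.
    """
    return sorted(list_entries(slave_mount) - list_entries(master_mount))
```

Calling `find_purged` with the two mount points (for example hypothetical paths such as `/mnt/master` and `/mnt/slave`) returns the stale slave entries. It walks both trees in full, which is exactly why this approach is slow on large volumes.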

Thanks for the quick reply!

[root@host ~]# gluster volume geo-replication test-poc 10.10.1.120::test-poc status detail

MASTER NODE    MASTER VOL    MASTER BRICK      SLAVE                    STATUS     CHECKPOINT STATUS    CRAWL STATUS    FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
host1.com      test-poc      /data/test-poc    10.10.1.120::test-poc    Passive    N/A                  N/A             382            0                0                0                  0
host2.com      test-poc      /data/test-poc    10.10.1.122::test-poc    Passive    N/A                  N/A             0              0                0                0                  0
host3.com      test-poc      /data/test-poc    10.10.1.121::test-poc    Active     N/A                  Hybrid Crawl    10765          70               0                0                  0

From: Venky Shankar <yknev.shankar@xxxxxxxxx>

Date: Wednesday, April 16, 2014 at 1:54 PM

To: CJ Beck <chris.beck@xxxxxxxxxxx>

Cc: "gluster-users@xxxxxxxxxxx" <gluster-users@xxxxxxxxxxx>

Subject: Re:  Question about geo-replication and deletes in 3.5 beta train

"ignore-deletes" is only valid in the initial crawl mode[1] where it does not propagate deletes to the slave (changelog mode does). Was the session restarted by any chance?

[1] Geo-replication now has two internal operation modes: a one-shot filesystem crawl mode (used to replicate data already present in a volume) and the changelog mode (for replicating live changes).

Thanks,

-venky

On Thu, Apr 17, 2014 at 1:25 AM, CJ Beck <chris.beck@xxxxxxxxxxx> wrote:

I have an issue where deletes are not being propagated to the slave cluster in a geo-replicated environment. I’ve looked through the code, and it appears as though this is something that might have been changed to be hard coded?

When I try to change it via a config option on the command line, it replies with a “reserved option” error:

[root@host ~]# gluster volume geo-replication test-poc 10.10.1.120::test-poc config ignore_deletes 1
Reserved option
geo-replication command failed
[root@host ~]# gluster volume geo-replication test-poc 10.10.1.120::test-poc config ignore-deletes 1
Reserved option
geo-replication command failed
[root@host ~]#

Looking at the source code (although, I’m not a C expert by any means), it seems as though it’s hard-coded to be “true” all the time?

(from glusterd-geo-rep.c):

        /* ignore-deletes */
        runinit_gsyncd_setrx (&runner, conf_path);
        runner_add_args (&runner, "ignore-deletes", "true", ".", ".", NULL);
        RUN_GSYNCD_CMD;

Any ideas how to get deletes propagated to the slave cluster?

Thanks!

-CJ

_______________________________________________

Gluster-users mailing list

Gluster-users@xxxxxxxxxxx

http://supercolony.gluster.org/mailman/listinfo/gluster-users
