On Tue, Sep 23, 2014 at 4:54 PM, Craig Lewis <clewis at centraldesktop.com> wrote:
> I've had some issues in my secondary cluster. I'd like to restart
> replication from the beginning, without destroying the data in the secondary
> cluster.
>
> Reading the radosgw-agent and Admin REST API code, I believe I just need to
> stop replication, delete the secondary zone's log_pool, recreate the
> log_pool, and restart replication.
>
> Anybody have any thoughts? I'm still setting up some VMs to test this,
> before I try it in production.
>
>
>
> Background:
> I'm on Emperor (yeah, still need to upgrade). I believe I ran into
> http://tracker.ceph.com/issues/7595 . My read of that patch is that it
> prevents the problem from occurring, but doesn't correct corrupt data. I
> tried applying some of the suggested patches, but they only ignored the
> error, rather than correcting it. I finally dropped the corrupt pool. That
> allowed the stock Emperor binaries to run without crashing. The pool I
> dropped was my secondary zone's log_pool.
>
> Before I dropped the pool, I copied all of the objects to local disk. After
> re-creating the pool, I uploaded the objects.
>
> Now replication is kind of working, but not correctly. I have a number of
> buckets that are being written to in the primary cluster, but no replication
> is occurring. radosgw-agent says a number of shards have >= 1000 log
> entries, but then it never processes the buckets in those shards.
>
> Looking back at the pool's contents on local disk, all of the files are 0
> bytes. So I'm assuming all of the important state was stored in the
> objects' metadata.
>
> I'd like to completely zero out the replication state, then exploit a
> feature in radosgw-agent 1.1 that will only replicate the first 1000 objects
> in buckets, if the bucket isn't being actively written to. Then I can
> restart radosgw-agent 1.2, and let it catch up the active buckets. That'll
> save me many weeks and TB of replication.
>
> Obviously, I'll compare bucket listings between the two clusters when I'm
> done. I'll probably try to catch up the read-only buckets' state at a later
> date.
>

I don't really understand what happened here. Maybe start with trying to
understand why the sync agent isn't replicating anymore? Maybe the
replicalog markers are off?

Yehuda
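
[Editor's aside] For reference, a minimal sketch of backing up a pool's objects together with their xattrs via the python-rados bindings, since a data-only copy of rgw log objects produces exactly the 0-byte files described above; much of the replication state lives in object metadata. This is not the procedure Craig actually used: the pool name ".us-secondary.log", the output directory, and the hex/JSON encoding are assumptions for illustration, and omap entries (which rgw also uses) are deliberately not handled here.

#!/usr/bin/env python
# Sketch only: dump each object's data *and* its xattrs from a pool.
# Assumes python-rados is installed and /etc/ceph/ceph.conf points at the
# secondary cluster. Pool and output paths below are hypothetical.
import binascii
import json
import os

import rados

POOL = '.us-secondary.log'            # hypothetical log_pool name
OUTDIR = '/var/tmp/log_pool_backup'   # hypothetical destination

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    ioctx = cluster.open_ioctx(POOL)
    try:
        if not os.path.isdir(OUTDIR):
            os.makedirs(OUTDIR)
        for obj in ioctx.list_objects():
            size, _mtime = ioctx.stat(obj.key)
            # Object payload; legitimately 0 bytes for many rgw log objects.
            with open(os.path.join(OUTDIR, obj.key + '.data'), 'wb') as f:
                if size:
                    f.write(ioctx.read(obj.key, size))
            # Extended attributes, hex-encoded so they survive a round trip.
            xattrs = {name: binascii.hexlify(value).decode('ascii')
                      for name, value in ioctx.get_xattrs(obj.key)}
            with open(os.path.join(OUTDIR, obj.key + '.xattrs.json'), 'w') as f:
                json.dump(xattrs, f)
            # NOTE: omap key/value pairs are not captured here; rgw keeps
            # state there too (cf. `rados listomapvals`), so a complete
            # backup/restore would need to handle omap as well.
    finally:
        ioctx.close()
finally:
    cluster.shutdown()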