Re: radosgw multi site different period

Your period configuration is indeed consistent between zones. This "master is on a different period" error is specific to the metadata sync status. It's saying that zone b is unable to finish syncing the metadata changes from zone a that occurred during the previous period. Even though zone b was the master during that period, it needs to re-sync from zone a to make sure everyone ends up with a consistent view (even if this results in the loss of metadata changes).

It sounds like zone a was re-promoted to master before it had a chance to catch up completely. The docs offer some guidance [1] on avoiding this situation, but you can recover on zone b by running `radosgw-admin metadata sync init` and restarting its gateways to trigger a new full sync.

[1] http://docs.ceph.com/docs/luminous/radosgw/multisite/#changing-the-metadata-master-zone
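
In practice, a minimal sketch of that recovery on zone b would look something like the following; the systemd unit name here is just an assumption, use whatever your rgw instances are actually called:

root@ceph-b-1:~# radosgw-admin metadata sync init
root@ceph-b-1:~# systemctl restart ceph-radosgw@rgw.ceph-b-1

Once the gateways are back up, `radosgw-admin sync status` on zone b should show the full metadata sync running against the current period instead of the stale one.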

On 11/15/2017 02:56 AM, Kim-Norman Sahm wrote:
both clusters are in the same epoch and period:
root@ceph-a-1:~# radosgw-admin period get-current
{
     "current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
}

root@ceph-b-1:~# radosgw-admin period get-current
{
     "current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
}

but the sync state is still "master is on a different period":

root@ceph-b-1:~# radosgw-admin sync status
           realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
       zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
            zone 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
   metadata sync syncing
                 full sync: 0/64 shards
                 master is on a different period: master_period=b7392c41-9cbe-4d92-ad03-db607dd7d569 local_period=d306a847-77a6-4306-87c9-0bb4fa16cdc4
                 incremental sync: 64/64 shards
                 metadata is caught up with master
       data sync source: 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
                         syncing
                         full sync: 0/128 shards
                         incremental sync: 128/128 shards
                         data is caught up with source


On Tuesday, 14.11.2017, at 18:21 +0100, Kim-Norman Sahm wrote:
both clusters are in the same epoch and period:

root@ceph-a-1:~# radosgw-admin period get-current
{
     "current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
}

root@ceph-b-1:~# radosgw-admin period get-current
{
     "current_period": "b7392c41-9cbe-4d92-ad03-db607dd7d569"
}

On Tuesday, 14.11.2017, at 17:05 +0000, David Turner wrote:
I'm assuming you've looked at the period in both places (`radosgw-admin period get`) and confirmed that the second site is behind the master site (based on epochs). I'm also assuming (since you linked the instructions) that you've done `radosgw-admin period pull` on the second site to get any period updates that have been made on the master site.

If my assumptions are wrong, then you should do those things. If my assumptions are correct, then running `radosgw-admin period update --commit` on the master site and `radosgw-admin period pull` on the second site might fix this. If you've already done that as well (as they're steps in the article you linked), then you need someone smarter than I am to chime in.
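
Concretely, that sequence would look something like this (assuming cluster a currently holds the master zone; depending on how the realm was set up, `period pull` may also need --url plus the system user's access key and secret):

root@ceph-a-1:~# radosgw-admin period update --commit
root@ceph-b-1:~# radosgw-admin period pull
root@ceph-b-1:~# radosgw-admin period get

Comparing the period id and epoch from `period get` on both sides should tell you whether the second site has actually caught up.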

On Tue, Nov 14, 2017 at 11:35 AM Kim-Norman Sahm <kisahm@t-online.de> wrote:
hi,

I've installed a ceph multi-site setup with two ceph clusters, each with its own radosgw.
The multi-site setup was in sync, so I tried a failover:
cluster A went down, and I changed zone b on cluster b to be the new master zone.
That worked fine.

Now I've started cluster A again and I'm trying to switch the master zone back to A.
Cluster A believes that it is the master and cluster b is the secondary.
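
(For reference, the promotion on each side was done roughly as in the linked howto, i.e. something like:

root@ceph-b-1:~# radosgw-admin zone modify --rgw-zone=b --master --default
root@ceph-b-1:~# radosgw-admin period update --commit

and the same again on cluster A with --rgw-zone=a when switching back, followed by a restart of the gateways.)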
But the secondary is on a different period, and the bucket delta has not been synced to the new master zone:

root@ceph-a-1:~# radosgw-admin sync status
           realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
       zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
            zone 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
   metadata sync no sync (zone is master)
       data sync source: 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
                         syncing
                         full sync: 0/128 shards
                         incremental sync: 128/128 shards
                         data is caught up with source

root@ceph-b-1:~# radosgw-admin sync status
           realm 833e65be-268f-42c2-8f3c-9bab83ebbff2 (myrealm)
       zonegroup 15550dc6-a761-473f-81e8-0dc6cc5106bd (ceph)
            zone 082cd970-bd25-4cbc-a5fd-20f3b3f9dbd2 (b)
   metadata sync syncing
                 full sync: 0/64 shards
                 master is on a different period: master_period=b7392c41-9cbe-4d92-ad03-db607dd7d569 local_period=d306a847-77a6-4306-87c9-0bb4fa16cdc4
                 incremental sync: 64/64 shards
                 metadata is caught up with master
       data sync source: 51019cee-86fb-4b39-b6ba-282171c459c6 (a)
                         syncing
                         full sync: 0/128 shards
                         incremental sync: 128/128 shards
                         data is caught up with source

How can I force a sync of the period and the bucket deltas?
I've used this howto: http://docs.ceph.com/docs/master/radosgw/multisite/

br Kim