Hi John,

Can you provide your zonegroup and zone configurations from all 3 RGWs? (Run the commands on each RGW.)
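For example, something along these lines should dump what I'm after on Jewel (a rough sketch from memory, so double-check the syntax against your version; the zonegroup/zone names are taken from your output below and may need adjusting on your side):

# run on each RGW host; "us", "us-dfw" and "us-phx" are the names from your paste
radosgw-admin realm get
radosgw-admin zonegroup get --rgw-zonegroup=us
radosgw-admin zone get --rgw-zone=us-dfw        # --rgw-zone=us-phx on the secondary cluster
radosgw-admin period get-current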
Thanks,
Orit

On Wed, Sep 21, 2016 at 11:14 PM, John Rowe <john.rowe@xxxxxxxxxxxxxx> wrote:
> Hello,
>
> We have 2 Ceph clusters running in two separate data centers, each one with 3 mons, 3 rgws, and 5 osds. I am attempting to get bi-directional multi-site replication set up as described in the Ceph documentation here:
> http://docs.ceph.com/docs/jewel/radosgw/multisite/
>
> We are running Jewel v10.2.2:
> rpm -qa | grep ceph
> ceph-base-10.2.2-0.el7.x86_64
> ceph-10.2.2-0.el7.x86_64
> ceph-radosgw-10.2.2-0.el7.x86_64
> libcephfs1-10.2.2-0.el7.x86_64
> python-cephfs-10.2.2-0.el7.x86_64
> ceph-selinux-10.2.2-0.el7.x86_64
> ceph-mon-10.2.2-0.el7.x86_64
> ceph-osd-10.2.2-0.el7.x86_64
> ceph-release-1-1.el7.noarch
> ceph-common-10.2.2-0.el7.x86_64
> ceph-mds-10.2.2-0.el7.x86_64
>
> It appears syncing is happening; however, it is not able to sync the metadata, so no users/buckets from the primary are making it to the secondary.
>
> Primary sync status:
> radosgw-admin sync status
>           realm 3af93a86-916a-490f-b38f-17922b472b19 (my_realm)
>       zonegroup 235b010c-22e2-4b43-8fcc-8ae01939273e (us)
>            zone 6c830b44-4e39-4e19-9bd8-03c37c2021f2 (us-dfw)
>   metadata sync no sync (zone is master)
>       data sync source: 58aa3eef-fc1f-492c-a08e-9c6019e7c266 (us-phx)
>                         syncing
>                         full sync: 0/128 shards
>                         incremental sync: 128/128 shards
>                         data is caught up with source
>
> radosgw-admin data sync status --source-zone=us-phx
> {
>     "sync_status": {
>         "info": {
>             "status": "sync",
>             "num_shards": 128
>         },
>         "markers": [
>         ...
> }
>
> radosgw-admin metadata sync status
> {
>     "sync_status": {
>         "info": {
>             "status": "init",
>             "num_shards": 0,
>             "period": "",
>             "realm_epoch": 0
>         },
>         "markers": []
>     },
>     "full_sync": {
>         "total": 0,
>         "complete": 0
>     }
> }
>
> Secondary sync status:
> radosgw-admin sync status
>           realm 3af93a86-916a-490f-b38f-17922b472b19 (pardot)
>       zonegroup 235b010c-22e2-4b43-8fcc-8ae01939273e (us)
>            zone 58aa3eef-fc1f-492c-a08e-9c6019e7c266 (us-phx)
>   metadata sync failed to read sync status: (2) No such file or directory
>       data sync source: 6c830b44-4e39-4e19-9bd8-03c37c2021f2 (us-dfw)
>                         syncing
>                         full sync: 0/128 shards
>                         incremental sync: 128/128 shards
>                         data is behind on 10 shards
>                         oldest incremental change not applied: 2016-09-20 15:00:17.0.330225s
>
> radosgw-admin data sync status --source-zone=us-dfw
> {
>     "sync_status": {
>         "info": {
>             "status": "building-full-sync-maps",
>             "num_shards": 128
>         },
> ....
> }
>
> radosgw-admin metadata sync status --source-zone=us-dfw
> ERROR: sync.read_sync_status() returned ret=-2
>
> In the logs I am seeing (timestamps are out of order because I picked out non-duplicate lines):
>
> Primary logs:
> 2016-09-20 15:02:44.313204 7f2a2dffb700  0 ERROR: client_io->complete_request() returned -5
> 2016-09-20 10:31:57.501247 7faf4bfff700  0 ERROR: failed to wait for op, ret=-11: POST http://pardot0-cephrgw1-3-phx.ops.sfdc.net:80/admin/realm/period?period=385c44c7-0506-4204-90d7-9d26a6cbaad2&epoch=12&rgwx-zonegroup=f46ce11b-ee5d-489b-aa30-752fc5353931
> 2016-09-20 10:32:03.391118 7fb12affd700  0 ERROR: failed to fetch datalog info
> 2016-09-20 10:32:03.491520 7fb12affd700  0 ERROR: lease cr failed, done early
>
> Secondary logs:
> 2016-09-20 10:28:15.290050 7faab2fed700  0 ERROR: failed to get bucket instance info for bucket id=BUCKET1:78301214-35bb-41df-a77f-24968ee4b3ff.104293.1
> 2016-09-20 10:28:15.290108 7faab5ff3700  0 ERROR: failed to get bucket instance info for bucket id=BUCKET1:78301214-35bb-41df-a77f-24968ee4b3ff.104293.1
> 2016-09-20 10:28:15.290571 7faab77f6700  0 ERROR: failed to get bucket instance info for bucket id=BUCKET1:78301214-35bb-41df-a77f-24968ee4b3ff.104293.1
> 2016-09-20 10:28:15.304619 7faaad7e2700  0 ERROR: failed to get bucket instance info for bucket id=BUCKET1:78301214-35bb-41df-a77f-24968ee4b3ff.104293.1
> 2016-09-20 10:28:38.169629 7fa98bfff700  0 ERROR: failed to distribute cache for .rgw.root:periods.385c44c7-0506-4204-90d7-9d26a6cbaad2.12
> 2016-09-20 10:28:38.169642 7fa98bfff700 -1 period epoch 12 is not newer than current epoch 12, discarding update
> 2016-09-21 03:19:01.550808 7fe10bfff700  0 rgw meta sync: ERROR: failed to fetch mdlog info
> 2016-09-21 15:45:09.799195 7fcd677fe700  0 ERROR: failed to fetch remote data log info: ret=-11
>
> Each of those log messages is repeated several times, constantly.
>
> Any help would be greatly appreciated. Thanks!

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com