Hello Orit,

Yes, that bug looks related. Was the fix included in 10.2.3? I guess not,
as I have since updated to 10.2.3 and am still getting the same errors.

That bug is about not retrying after a failure; do you know, though, why
the sync fails in the first place? It seems that basically any object
over 500k in size fails :(

Kind regards,

Ben Morrice

______________________________________________________________________
Ben Morrice | e: ben.morrice@xxxxxxx | t: +41-21-693-9670
EPFL ENT CBS BBP
Biotech Campus
Chemin des Mines 9
1202 Geneva
Switzerland

On 23/09/16 16:52, Orit Wasserman wrote:
> Hi Ben,
> It seems to be http://tracker.ceph.com/issues/16742.
> It is being backported to jewel (http://tracker.ceph.com/issues/16794);
> you can try applying it and see if it helps you.
>
> Regards,
> Orit
>
> On Fri, Sep 23, 2016 at 9:21 AM, Ben Morrice <ben.morrice@xxxxxxx> wrote:
>> Hello all,
>>
>> I have two separate Ceph (10.2.2) clusters and have configured multisite
>> replication between the two. I can see that some buckets get synced;
>> however, others do not.
>>
>> Both clusters are RHEL7, and I have upgraded libcurl from 7.29 to 7.50
>> (to avoid http://tracker.ceph.com/issues/15915).
>>
>> Below is some debug output on the 'secondary' zone (bbp-gva-secondary)
>> after uploading a file to the bucket 'bentest1' on the master zone
>> (bbp-gva-master).
>>
>> This appears to be happening very frequently. The size of my bucket
>> pool on the master is ~120GB, but on the secondary site it is only
>> 5GB, so things are not very happy at the moment.
>>
>> What steps can I take to work out why RGW cannot create a lock in the
>> log pool?
>>
>> Is there a way to force a full sync, starting fresh? (The secondary
>> site is not advertised to users, so it is fine to even clean the pools
>> and start again.)
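
For reference, this is roughly what I have been looking at so far (a
sketch only; the pool, object and lock names are copied from the debug
output quoted below, and I am going from the jewel-era radosgw-admin and
rados CLIs, so treat the exact sub-commands and flags as approximate):

  # sync state on the secondary, overall and for the source zone
  radosgw-admin sync status
  radosgw-admin data sync status --source-zone=bbp-gva-master

  # retcode=-16 in the log below is -EBUSY, i.e. the sync-status lock is
  # already held; show who holds the advisory lock on that object
  rados -p .bbp-gva-secondary.log lock info \
    bucket.sync-status.bbp-gva-master:bentest1:bbp-gva-master.85732351.16 \
    sync_lock

  # if starting over is the answer, presumably a full re-init of data
  # sync from the master, followed by a radosgw restart:
  radosgw-admin data sync init --source-zone=bbp-gva-master

(If 'lock info' shows a stale holder, I assume 'rados lock break' would
clear it.)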
>>
>> 2016-09-23 09:03:28.498292 7f992e664700 20 execute(): read data: [{"key":6,"val":["bentest1:bbp-gva-master.85732351.16:-1"]}]
>> 2016-09-23 09:03:28.498453 7f992e664700 20 execute(): modified key=bentest1:bbp-gva-master.85732351.16:-1
>> 2016-09-23 09:03:28.498456 7f992e664700 20 wakeup_data_sync_shards: source_zone=bbp-gva-master, shard_ids={6=bentest1:bbp-gva-master.85732351.16:-1}
>> 2016-09-23 09:03:28.498547 7f9a72ffd700 20 incremental_sync(): async update notification: bentest1:bbp-gva-master.85732351.16:-1
>> 2016-09-23 09:03:28.499137 7f9a7dffb700 20 get_system_obj_state: rctx=0x7f9a3c5f8e08 obj=.bbp-gva-secondary.log:bucket.sync-status.bbp-gva-master:bentest1:bbp-gva-master.85732351.16 state=0x7f9a0c069848 s->prefetch_data=0
>> 2016-09-23 09:03:28.501379 7f9a72ffd700 20 operate(): sync status for bucket bentest1:bbp-gva-master.85732351.16:-1: 2
>> 2016-09-23 09:03:28.501433 7f9a877fe700 20 reading from .bbp-gva-secondary.domain.rgw:.bucket.meta.bentest1:bbp-gva-master.85732351.16
>> 2016-09-23 09:03:28.501447 7f9a877fe700 20 get_system_obj_state: rctx=0x7f9a877fc6d0 obj=.bbp-gva-secondary.domain.rgw:.bucket.meta.bentest1:bbp-gva-master.85732351.16 state=0x7f9a340cfbe8 s->prefetch_data=0
>> 2016-09-23 09:03:28.503269 7f9a877fe700 20 get_system_obj_state: rctx=0x7f9a877fc6d0 obj=.bbp-gva-secondary.domain.rgw:.bucket.meta.bentest1:bbp-gva-master.85732351.16 state=0x7f9a340cfbe8 s->prefetch_data=0
>> 2016-09-23 09:03:28.510428 7f9a72ffd700 20 sending request to https://bbpobjectstorage.epfl.ch:443/admin/log?bucket-instance=bentest1%3Abbp-gva-master.85732351.16&format=json&marker=00000000034.4578.3&type=bucket-index&rgwx-zonegroup=bbp-gva
>> 2016-09-23 09:03:28.625755 7f9a72ffd700 20 [inc sync] skipping object: bentest1:bbp-gva-master.85732351.16:-1/1m: non-complete operation
>> 2016-09-23 09:03:28.625759 7f9a72ffd700 20 [inc sync] syncing object: bentest1:bbp-gva-master.85732351.16:-1/1m
>> 2016-09-23 09:03:28.625831 7f9a72ffd700 20 bucket sync single entry (source_zone=bbp-gva-master) b=bentest1(@{i=.bbp-gva-secondary.rgw.buckets.index,e=.bbp-gva-master.rgw.buckets.extra}.bbp-gva-secondary.rgw.buckets[bbp-gva-master.85732351.16]):-1/1m[0] log_entry=00000000036.4586.3 op=0 op_state=1
>> 2016-09-23 09:03:28.625857 7f9a72ffd700 5 bucket sync: sync obj: bbp-gva-master/bentest1(@{i=.bbp-gva-secondary.rgw.buckets.index,e=.bbp-gva-master.rgw.buckets.extra}.bbp-gva-secondary.rgw.buckets[bbp-gva-master.85732351.16])/1m[0]
>> 2016-09-23 09:03:28.626092 7f9a85ffb700 20 get_obj_state: rctx=0x7f9a85ff96a0 obj=bentest1:1m state=0x7f9a30051cf8 s->prefetch_data=0
>> 2016-09-23 09:03:28.626119 7f9a72ffd700 20 sending request to https://bbpobjectstorage.epfl.ch:443/admin/log?bucket-instance=bentest1%3Abbp-gva-master.85732351.16&format=json&marker=00000000036.4586.3&type=bucket-index&rgwx-zonegroup=bbp-gva
>> 2016-09-23 09:03:28.627560 7f9a85ffb700 10 get_canon_resource(): dest=/bentest1/1m
>> /bentest1/1m
>> 2016-09-23 09:03:28.627612 7f9a85ffb700 20 sending request to https://bbpobjectstorage.epfl.ch:443/bentest1/1m?rgwx-zonegroup=bbp-gva&rgwx-prepend-metadata=bbp-gva
>> 2016-09-23 09:03:28.725185 7f9a72ffd700 20 incremental_sync:1067: shard_id=6 log_entry: 1_1474614207.373384_1713810.1:2016-09-23 09:03:27.0.373384s:bentest1:bbp-gva-master.85732351.16
>> 2016-09-23 09:03:28.725477 7f9a9affd700 20 get_system_obj_state: rctx=0x7f9a3c5f8e08 obj=.bbp-gva-secondary.log:bucket.sync-status.bbp-gva-master:bentest1:bbp-gva-master.85732351.16 state=0x7f9a741bb0a8 s->prefetch_data=0
>> 2016-09-23 09:03:28.728404 7f9a72ffd700 20 operate(): sync status for bucket bentest1:bbp-gva-master.85732351.16:-1: 2
>> 2016-09-23 09:03:28.728462 7f9a7b7f6700 20 reading from .bbp-gva-secondary.domain.rgw:.bucket.meta.bentest1:bbp-gva-master.85732351.16
>> 2016-09-23 09:03:28.728490 7f9a7b7f6700 20 get_system_obj_state: rctx=0x7f9a7b7f46d0 obj=.bbp-gva-secondary.domain.rgw:.bucket.meta.bentest1:bbp-gva-master.85732351.16 state=0x7f9a000b19b8 s->prefetch_data=0
>> 2016-09-23 09:03:28.729664 7f9a7b7f6700 20 get_system_obj_state: rctx=0x7f9a7b7f46d0 obj=.bbp-gva-secondary.domain.rgw:.bucket.meta.bentest1:bbp-gva-master.85732351.16 state=0x7f9a000b19b8 s->prefetch_data=0
>> 2016-09-23 09:03:28.731703 7f9a72ffd700 20 cr:s=0x7f9a3c5a4f90:op=0x7f9a3ca75ef0:20RGWContinuousLeaseCR: couldn't lock .bbp-gva-secondary.log:bucket.sync-status.bbp-gva-master:bentest1:bbp-gva-master.85732351.16:sync_lock: retcode=-16
>> 2016-09-23 09:03:28.731721 7f9a72ffd700 0 ERROR: incremental sync on bentest1 bucket_id=bbp-gva-master.85732351.16 shard_id=-1 failed, retcode=-16
>> 2016-09-23 09:03:28.758421 7f9a72ffd700 20 store_marker(): updating marker marker_oid=bucket.sync-status.bbp-gva-master:bentest1:bbp-gva-master.85732351.16 marker=00000000035.4585.2
>> 2016-09-23 09:03:28.829207 7f9a72ffd700 0 ERROR: failed to sync object: bentest1:bbp-gva-master.85732351.16:-1/1m
>> 2016-09-23 09:03:28.834281 7f9a72ffd700 20 store_marker(): updating marker marker_oid=bucket.sync-status.bbp-gva-master:bentest1:bbp-gva-master.85732351.16 marker=00000000036.4586.3
>>
>> --
>> Kind regards,
>>
>> Ben Morrice
>>
>> ______________________________________________________________________
>> Ben Morrice | e: ben.morrice@xxxxxxx | t: +41-21-693-9670
>> EPFL ENT CBS BBP
>> Biotech Campus
>> Chemin des Mines 9
>> 1202 Geneva
>> Switzerland

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com