On Tue, 19 Feb 2019 at 09:59, Iain Buclaw <ibuclaw@xxxxxxxxxx> wrote:
>
> On Wed, 6 Feb 2019 at 09:28, Iain Buclaw <ibuclaw@xxxxxxxxxx> wrote:
> >
> > On Tue, 5 Feb 2019 at 10:04, Iain Buclaw <ibuclaw@xxxxxxxxxx> wrote:
> > >
> > > On Tue, 5 Feb 2019 at 09:46, Iain Buclaw <ibuclaw@xxxxxxxxxx> wrote:
> > > >
> > > > Hi,
> > > >
> > > > Following the update of one secondary site from 12.2.8 to 12.2.11,
> > > > the following warnings have come up:
> > > >
> > > > HEALTH_WARN 1 large omap objects
> > > > LARGE_OMAP_OBJECTS 1 large omap objects
> > > >     1 large objects found in pool '.rgw.buckets.index'
> > > >     Search the cluster log for 'Large omap object found' for more details.
> > > >
> > >
> > > [...]
> > >
> > > Is this the reason why resharding hasn't propagated?
> > >
> > > Furthermore, it looks like the index is broken on the secondaries.
> > >
> > > On the master:
> > >
> > > # radosgw-admin bi get --bucket=mybucket --object=myobject
> > > {
> > >     "type": "plain",
> > >     "idx": "myobject",
> > >     "entry": {
> > >         "name": "myobject",
> > >         "instance": "",
> > >         "ver": {
> > >             "pool": 28,
> > >             "epoch": 8848
> > >         },
> > >         "locator": "",
> > >         "exists": "true",
> > >         "meta": {
> > >             "category": 1,
> > >             "size": 9200,
> > >             "mtime": "2018-03-27 21:12:56.612172Z",
> > >             "etag": "c365c324cda944d2c3b687c0785be735",
> > >             "owner": "mybucket",
> > >             "owner_display_name": "Bucket User",
> > >             "content_type": "application/octet-stream",
> > >             "accounted_size": 9194,
> > >             "user_data": ""
> > >         },
> > >         "tag": "0ef1a91a-4aee-427e-bdf8-30589abb2d3e.36603989.137292",
> > >         "flags": 0,
> > >         "pending_map": [],
> > >         "versioned_epoch": 0
> > >     }
> > > }
> > >
> > > On the secondaries:
> > >
> > > # radosgw-admin bi get --bucket=mybucket --object=myobject
> > > ERROR: bi_get(): (2) No such file or directory
> > >
> > > How does one go about rectifying this mess?
> >
> > A random blog post in a language I don't understand seems to allude to
> > using radosgw-admin bi put to restore backed-up indexes, but not under
> > what circumstances you would use such a command:
> >
> > https://cloud.tencent.com/developer/article/1032854
> >
> > Would this be safe to run on the secondaries?
>
> Removed the bucket on the secondaries and scheduled a new sync. However,
> this gets stuck at some point, and radosgw complains about:
>
> data sync: WARNING: skipping data log entry for missing bucket
> mybucket:0ef1a91a-4aee-427e-bdf8-30589abb2d3e.92151615.1:21
>
> Despairing of RGW ever doing a simple job right, I removed the
> problematic bucket on the master, but now there are hundreds of shard
> objects inside the index pool, all of which look to be orphaned, and
> the warnings about the missing bucket continue on the secondaries. In
> some cases there's an object on the secondary that doesn't exist on
> the master.
>
> All the while, Ceph is still complaining about large omap objects.
>
> $ ceph daemon mon.ceph-mon-1 config get osd_deep_scrub_large_omap_object_value_sum_threshold
> {
>     "osd_deep_scrub_large_omap_object_value_sum_threshold": "1073741824"
> }
>
> It seems implausible that the cluster is still complaining about this
> when the largest omap contains 71405 entries.
>
> I can't run bi purge or metadata rm on the unreferenced entries
> because the bucket itself is no more. Can I remove objects from the
> index pool using 'rados rm'?

Possibly related:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-November/031350.html

-- 
Iain Buclaw
*(p < e ? p++ : p) = (c & 0x0f) + '0';
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
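Before deciding whether 'rados rm' is safe for a given object, it can help to enumerate the leftover index shard objects and see how many omap keys each still holds. Below is a minimal sketch: the pool name and bucket marker are taken from the thread above and are assumptions for your zone; `rados ls` and `rados listomapkeys` are standard rados subcommands.

```shell
# Sketch: list index shard objects whose names contain a given bucket
# marker and print each object's omap key count. Bucket index shards
# are typically named .dir.<marker>.<shard-number>.
count_index_omap_keys() {
    pool=$1
    marker=$2
    rados -p "$pool" ls | grep "$marker" | while read -r obj; do
        # Count the omap keys (index entries) held by this shard object.
        keys=$(rados -p "$pool" listomapkeys "$obj" | wc -l | tr -d ' ')
        printf '%s %s\n' "$obj" "$keys"
    done
}

# Marker taken from the "missing bucket" warning in the thread (an
# assumption; substitute your own bucket's marker).
count_index_omap_keys .rgw.buckets.index \
    0ef1a91a-4aee-427e-bdf8-30589abb2d3e.92151615.1
```

If a shard really is unreferenced by any bucket instance (cross-check with `radosgw-admin metadata list bucket.instance` first), removing it with `rados -p .rgw.buckets.index rm <object>` is what ultimately clears both the orphaned data and the large-omap accounting, but treat that as a last resort and take a backup of the omap contents beforehand.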