Thanks for your response, Stefan.
On 21/12/2021 10:07, Stefan Schueffler wrote:
> Even without adding a lot of rgw objects (only a few PUTs per minute), we have thousands and thousands of rgw bucket.sync log entries in the rgw log pool (this seems to be a separate problem), and as such we accumulate "large omap objects" over time.
Since you are doing RADOSGW as well, those OMAP objects are usually
bucket index files
(https://docs.ceph.com/en/latest/rados/operations/health-checks/#large-omap-objects).
Since there is no dynamic resharding for multisite deployments
(https://docs.ceph.com/en/latest/radosgw/dynamicresharding/#rgw-dynamic-bucket-index-resharding)
until Quincy
(https://tracker.ceph.com/projects/rgw/issues?utf8=%E2%9C%93&set_filter=1&f%5B%5D=cf_3&op%5Bcf_3%5D=%3D&v%5Bcf_3%5D%5B%5D=multisite-reshard&f%5B%5D=&c%5B%5D=project&c%5B%5D=tracker&c%5B%5D=status&c%5B%5D=priority&c%5B%5D=subject&c%5B%5D=assigned_to&c%5B%5D=updated_on&c%5B%5D=category&c%5B%5D=fixed_version&c%5B%5D=cf_3&group_by=&t%5B%5D=),
you need to create enough shards for each bucket up front.
Otherwise you should receive this warning at about 200k objects
(~ keys) per shard (the threshold used to be 2 million, see
https://github.com/ceph/ceph/pull/29175/files).
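A rough sketch of how you could check the current shard count and reshard manually (bucket name and target shard count are placeholders; in a pre-Quincy multisite setup, resharding needs extra care, e.g. pausing sync for the affected bucket):

```shell
# Show how many index shards the bucket currently has
# ("num_shards" appears in the bucket stats output).
radosgw-admin bucket stats --bucket=mybucket | grep num_shards

# Check the per-object key threshold that triggers the
# "large omap objects" warning during deep scrub.
ceph config get osd osd_deep_scrub_large_omap_object_key_threshold

# Manually reshard the bucket index to a higher shard count.
radosgw-admin bucket reshard --bucket=mybucket --num-shards=31
```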
> we also face the same or at least a very similar problem. We are running pacific (16.2.6 and 16.2.7, upgraded from 16.2.x to y to z) on both sides of the rgw multisite. In our case, the scrub errors occur on the secondary side only.
Regarding your scrub errors: are they still coming up at random?
Could you check with "list-inconsistent-obj" whether yours are within
the OMAP data and confined to the metadata pools?
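A minimal sketch of that check (the pool name and PG id below are placeholders; substitute the pool your scrub errors are reported against):

```shell
# List the PGs in the pool that currently have inconsistencies.
rados list-inconsistent-pg default.rgw.buckets.index

# For each reported PG, dump the inconsistent objects. The "errors"
# field shows whether the mismatch is in the OMAP data
# (e.g. "omap_digest_mismatch") or elsewhere.
rados list-inconsistent-obj 7.a --format=json-pretty
```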
Regards
Christian
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx