Re: Random scrub errors (omap_digest_mismatch) on pgs of RADOSGW metadata pools (bug 53663)


 



Hi Christian,

essentially we only have one meaningful bucket, "jobpostings". This bucket has 7001 shards (so, theoretically, room for up to 1.4 billion objects). In our first tests with the default 11 shards, we ran into the large-omap-object problem with too many objects per shard - so we decided to delete the bucket and create a new one with 7001 shards to be future proof :-).
Currently, there are roughly 40 million objects in this bucket - so, spread over 7001 shards, there should be no "large omap" warning caused by too many keys per shard.
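
For what it's worth, the per-shard key counts can be verified with something like this (just a sketch; <bucket-marker> is a placeholder for the bucket ID/marker shown by "radosgw-admin bucket stats --bucket=jobpostings"):

# count the omap keys (= bucket index entries) on each shard object
for obj in $(rados -p de-dus5.rgw.buckets.index ls | grep "^\.dir\.<bucket-marker>\."); do
    echo "$obj: $(rados -p de-dus5.rgw.buckets.index listomapkeys "$obj" | wc -l)"
done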

Instead, we have
ceph -s
1 large objects found in pool 'de-dus5.rgw.log'

The large object always appears in the rgw-log-pool. Checking this pool with

rados -p de-dus5.rgw.log ls

we see a lot of objects similar to this:

bucket.sync-status.da0efdbd-b177-48e2-883b-49a71fb5a27c:jobpostings:da0efdbd-b177-48e2-883b-49a71fb5a27c.102762984.2:129

Those bucket.sync-status.* objects are created at a rate of several tens of thousands per day. If we do not delete them manually, the large-object warning reappears after a few days.
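
The manual cleanup is essentially something like this (a sketch; the grep pattern is just derived from the object names above):

# delete all bucket.sync-status.* objects from the rgw log pool
rados -p de-dus5.rgw.log ls \
    | grep '^bucket.sync-status\.' \
    | while read obj; do rados -p de-dus5.rgw.log rm "$obj"; done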



Regarding the other problem, the OSD scrub errors, we see this:

ceph health detail shows "PG_DAMAGED: Possible data damage: x pgs inconsistent."
Every now and then new pgs become inconsistent. All inconsistent pgs belong to the bucket index pool de-dus5.rgw.buckets.index.

ceph health detail
pg 136.1 is active+clean+inconsistent, acting [8,3,0]

rados -p de-dus5.rgw.buckets.index list-inconsistent-obj 136.1
No scrub information available for pg 136.1
error 2: (2) No such file or directory

rados list-inconsistent-obj 136.1
No scrub information available for pg 136.1
error 2: (2) No such file or directory

ceph pg deep-scrub 136.1
instructing pg 136.1 on osd.8 to deep-scrub

… but so far nothing has changed; list-inconsistent-obj still does not show any information (did I miss some CLI arguments?)
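
If I understand it correctly, list-inconsistent-obj only returns data once the deep scrub triggered above has actually completed, so the plan is roughly this (a sketch):

ceph pg deep-scrub 136.1
# wait for the scrub to finish, e.g. by watching the timestamp / error count:
ceph pg 136.1 query | grep -E 'last_deep_scrub_stamp|num_scrub_errors'
rados list-inconsistent-obj 136.1 --format=json-pretty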

Usually, we simply do a
ceph pg repair 136.1
which most of the time silently does whatever it is supposed to do, and the error disappears. Shortly after, it reappears at random, on some other (or the same) pg of the rgw.buckets.index pool…
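
To spot newly inconsistent pgs in the meantime, something like the following works, e.g.:

ceph pg ls inconsistent
ceph health detail | grep -A3 PG_DAMAGED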

Regards,
Stefan






On 21.12.2021, at 12:15, Christian Rohmann <christian.rohmann@xxxxxxxxx> wrote:

Thanks for your response Stefan,

On 21/12/2021 10:07, Stefan Schueffler wrote:
Even without adding a lot of rgw objects (only a few PUTs per minute), we have thousands and thousands of rgw bucket.sync log entries in the rgw log pool (this seems to be a separate problem), and as such we accumulate "large omap objects" over time.
Since you are doing RADOSGW as well, those OMAP objects are usually bucket index files (https://docs.ceph.com/en/latest/rados/operations/health-checks/#large-omap-objects). Since there is no dynamic resharding (https://docs.ceph.com/en/latest/radosgw/dynamicresharding/#rgw-dynamic-bucket-index-resharding) until Quincy (https://tracker.ceph.com/projects/rgw/issues?utf8=%E2%9C%93&set_filter=1&f%5B%5D=cf_3&op%5Bcf_3%5D=%3D&v%5Bcf_3%5D%5B%5D=multisite-reshard&f%5B%5D=&c%5B%5D=project&c%5B%5D=tracker&c%5B%5D=status&c%5B%5D=priority&c%5B%5D=subject&c%5B%5D=assigned_to&c%5B%5D=updated_on&c%5B%5D=category&c%5B%5D=fixed_version&c%5B%5D=cf_3&group_by=&t%5B%5D=), you need to create enough shards for each bucket up front.

Otherwise you should receive this warning at about 200k objects (~ keys) per shard (the threshold used to be 2 million, see https://github.com/ceph/ceph/pull/29175/files).
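
The threshold the OSDs currently apply can be checked with, e.g.:

ceph config get osd osd_deep_scrub_large_omap_object_key_threshold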




We also face the same, or at least a very similar, problem. We are running Pacific (16.2.6 and 16.2.7, upgraded from 16.2.x to y to z) on both sides of the rgw multisite. In our case, the scrub errors occur on the secondary side only.
Regarding your scrub errors: do those still come up at random?
Could you check with "list-inconsistent-obj" whether yours are within the OMAP data and in the metadata pools only?







Regards



Christian









