Re: rgw.none and large num_objects


 



You can also try this:
radosgw-admin bucket check --fix --bucket=<bucket>

This will fix it temporarily.
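
To sanity-check the result, something like the following can be used (a rough sketch only; substitute the real bucket name):

  # per-category counters, including rgw.none, before the fix
  radosgw-admin bucket stats --bucket=<bucket>
  # rebuild the bucket index accounting
  radosgw-admin bucket check --fix --bucket=<bucket>
  # confirm the rgw.none counters are gone or back to sane values
  radosgw-admin bucket stats --bucket=<bucket>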

Istvan Szabo
Senior Infrastructure Engineer
---------------------------------------------------
Agoda Services Co., Ltd.
e: istvan.szabo@xxxxxxxxx
---------------------------------------------------

-----Original Message-----
From: Christopher Durham <caduceus42@xxxxxxx>
Sent: Tuesday, April 19, 2022 4:07 AM
To: ceph-users@xxxxxxx
Subject: rgw.none and large num_objects


All:

I am using Ceph Pacific 16.2.7 on Rocky Linux 8.5 in a multi-site configuration, with two sites that replicate S3 in both directions. This cluster has been upgraded from Nautilus over several years as new releases have become available. The migrate2rocky script provided by Rocky for migration from CentOS worked fine on my systems with some minor changes, such as local repos.

On one of the two sites, I have the large omap object warning. Searching the log, I tracked this down to a large omap object in a particular bucket's index.
The stats for that bucket (radosgw-admin bucket stats --bucket bucketname) show an rgw.none entry in the usage section with a num_objects of 18446744073709551592. I am uncertain when this bucket first gained that rgw.none entry with a huge num_objects, but it probably first occurred before we upgraded to Pacific.
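
For reference, the kinds of commands involved look roughly like this (a sketch only; the pool, index object, and bucket names are placeholders):

  # the health warning names the pool; the cluster log names the object
  ceph health detail
  grep 'Large omap object found' /var/log/ceph/ceph.log
  # count the omap keys on the reported bucket index object
  rados -p <index-pool> listomapkeys <.dir.bucket-index-object> | wc -l
  # per-category usage counters, including rgw.none
  radosgw-admin bucket stats --bucket=<bucket>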

I am aware that a num_objects of that value represents a negative number stored as an unsigned 64-bit integer.
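
For what it's worth, 2^64 - 18446744073709551592 = 24, so the stored counter is effectively -24. A quick, purely illustrative check:

  echo '18446744073709551592 - 2^64' | bc
  # prints -24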

I have found elsewhere that the 'correct' procedure in this case, to eliminate the large omap object warning, is to disable sync for the bucket, stop all RGWs on both sites of the multisite config, reshard the bucket on the master, purge the bucket on the slave, then restart the RGWs and re-enable sync. That is unacceptable on my system.
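
For reference, the command forms involved in that procedure look roughly like the following (a hedged sketch only; it omits stopping and restarting the RGW daemons, and the shard count and bucket name are placeholders):

  # stop replication for just this bucket
  radosgw-admin bucket sync disable --bucket=<bucket>
  # on the master zone: manually reshard the bucket index
  radosgw-admin bucket reshard --bucket=<bucket> --num-shards=<n>
  # on the secondary zone: remove the bucket and its index objects
  radosgw-admin bucket rm --bucket=<bucket> --purge-objects
  # afterwards: re-enable sync
  radosgw-admin bucket sync enable --bucket=<bucket>
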
The 'good' thing here is that the bucket actually contains zero objects, as reported by rgw.main in the stats output as well as by the AWS CLI.
It should be noted that the other site in the multisite configuration does not have the large omap object warning, has no rgw.none section in its stats output, and also shows zero objects.
So I am tempted to just delete (and recreate) the bucket, in the hope that the large omap object issue goes away. Am I correct? Is it OK to just delete and recreate the bucket? Will deleting such a bucket leave orphaned objects in the pool as well?

Thanks,
-Chris

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


