Hi there,
We have an old cluster that was originally built on Giant and has been maintained and upgraded over time; it is now running Mimic 13.2.5. The other day we received a HEALTH_WARN about 1 large omap object in the pool '.usage', which is the usage_log_pool defined in our radosgw zone.
I am trying to understand the purpose of the usage_log_pool and whether or not we have appropriate settings (shards, replicas, etc) in place.
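For anyone wanting to check the same thing on their own cluster, I believe the pool-to-role mapping is visible in the zone config, along these lines:
```
# The zone config shows which pool backs the usage log
radosgw-admin zone get | grep usage_log_pool
```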
We were able to identify the large omap object as 'usage.22' in the .usage pool. This particular "bucket" had over 2 million omap keys:
```
for i in `rados -p .usage ls`; do echo "$i"; rados -p .usage listomapkeys "$i" | wc -l; done
-snip-
usage.13
20
usage.22
2023790
usage.25
14
-snip-
```
These keys all seem to be metadata/pointers for valid data in our OpenStack object storage, where we hold about 1 PB of unique data.
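For what it's worth, a summarized view of what's sitting in the usage log (without dumping every entry) can be pulled with something like the following; I'm only sketching it here, so treat it as a rough check rather than a definitive inventory:
```
# Summarize usage log contents per user without printing every log entry
radosgw-admin usage show --show-log-entries=false
```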
To resolve the HEALTH_WARN we changed 'osd_deep_scrub_large_omap_object_key_threshold' from '2000000' to '2500000' using 'ceph config set osd ...' on our mons.
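Concretely, the change was roughly the following (the value was chosen just to clear the warning, not as a recommendation):
```
# Raise the per-object omap key count that triggers the large omap warning
ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 2500000
# Confirm the override is in place
ceph config dump | grep large_omap
```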
I'd like to know how important this pool is, as I also noticed that its replication is only set to 2, instead of 3 like all our other pools (with the exception of .users.email, which is also 2). If it is important, I'd like to set the replication to 3, and I'm curious whether there would be any negative impact on the cluster. The .usage pool shows 0 bytes used in 'ceph df', but it contains 30 objects, each carrying many omap keys.
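If bumping it is safe, I assume it would just be something like the sketch below, with the third copy backfilling in the background, but please correct me if there's more to it:
```
# Current replication settings for the pool
ceph osd pool get .usage size
ceph osd pool get .usage min_size
# Tentative change to 3 replicas
ceph osd pool set .usage size 3
ceph osd pool set .usage min_size 2
```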
I am also wondering about bucket index max shards, for which we have '8' set in the config:
```
"rgw_override_bucket_index_max_shards": "8",
```
Should this be increased?
Thanks in advance for any responses; I have found this mailing list to be an excellent source of information!
Jared Baker
Ontario Institute for Cancer Research