Hi,

I am currently dealing with a cluster that's been in use for 5 years
and during that time has never had its radosgw usage log trimmed. Now
that the cluster has been upgraded to Nautilus (and has completed a
full deep-scrub), it is in a permanent state of HEALTH_WARN because of
one large omap object:

$ ceph health detail
HEALTH_WARN 1 large omap objects
LARGE_OMAP_OBJECTS 1 large omap objects
    1 large objects found in pool '.usage'

As far as I can tell, there are two thresholds that can trigger that
warning:

* The default omap object size warning threshold,
  osd_deep_scrub_large_omap_object_value_sum_threshold, is 1G.
* The default omap object key count warning threshold,
  osd_deep_scrub_large_omap_object_key_threshold, is 200000.

In this case, this was the original situation:

osd.6 [WRN] : Large omap object found. Object: 15:169282cd:::usage.20:head Key count: 5834118 Size (bytes): 917351868

So that's 5.8M keys (way above the key count threshold) and 875 MiB
total object size (below the size threshold, but not by much).

The usage log in this case was no longer needed that far back, so I
trimmed it to keep only the entries from this year (radosgw-admin
usage trim --end-date 2018-12-31), a process that took upward of an
hour. After the trim (and a deep-scrub of the PG in question¹), my
situation looks like this:

osd.6 [WRN] Large omap object found. Object: 15:169282cd:::usage.20:head Key count: 1185694 Size (bytes): 187061564

So both the key count and the total object size have shrunk by about
80%, which is roughly what you'd expect when trimming 5 years of usage
log down to 1 year. However, the key count is still almost 6 times the
threshold.

I am aware that I can silence the warning by increasing
osd_deep_scrub_large_omap_object_key_threshold by a factor of 10, but
that's not my question. My question is what I can do to prevent the
usage log from creating such large omap objects in the first place.

Now, there's something else you should know about this radosgw: it is
configured with the defaults for usage log sharding:

rgw_usage_max_shards = 32
rgw_usage_max_user_shards = 1

... and this cluster's radosgw is used almost exclusively by a single
application user. So the fact that it's happy to shard the usage log
32 ways is irrelevant as long as it puts one user's usage log entirely
into one shard.

So, I am assuming that if I bump rgw_usage_max_user_shards up to, say,
16 or 32, all *new* usage log entries will be sharded. But I am not
aware of any way to reshard the *existing* usage log. Is there such a
thing?

Otherwise, it seems like the only options in this situation are to
clear the usage log altogether and tweak the sharding knobs², which
should at least keep the problem from reappearing, or to bump
osd_deep_scrub_large_omap_object_key_threshold and just live with the
large object³.

Also, is anyone aware of any adverse side effects of increasing these
thresholds, and/or changing the usage log sharding settings, that I
should keep in mind here?

Thanks in advance for your thoughts.

Cheers,
Florian

¹For anyone reading this in the archives because they've run into the
same problem and are wondering how to find out which PGs in a pool
have too-large objects, here's a jq one-liner:

ceph --format=json pg ls-by-pool <poolname> \
  | jq '.pg_stats[]|select(.stat_sum.num_large_omap_objects>0)'
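²For concreteness, here is roughly the sharding change I have in mind.
Treat this as a sketch rather than something I've already tested, and
note that the rgw section name (client.rgw.gateway1) is just a
placeholder for whatever the actual instance is called:

# ceph.conf on the radosgw host: spread a single user's usage log
# across more shards (as far as I understand, this only affects
# usage log entries written after the change)
[client.rgw.gateway1]
    rgw usage max shards = 32
    rgw usage max user shards = 16

...followed by a restart of that radosgw instance to pick up the new
values.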
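³And the alternative: raising the key-count warning threshold so the
existing object no longer trips the deep-scrub check. The value
2000000 here is just an arbitrary number comfortably above my current
1185694 keys, not a recommendation:

$ ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 2000000

As with the trim, I'd expect the warning to update only after the PG
holding usage.20 has been deep-scrubbed again. To keep an eye on the
key count of the offending object in the meantime:

$ rados -p .usage listomapkeys usage.20 | wc -l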