Hi David,

On 28/10/2019 20:44, David Monschein wrote:
> Hi All,
>
> Running an object storage cluster, originally deployed with Nautilus
> 14.2.1 and now running 14.2.4.
>
> Last week I was alerted to a new warning from my object storage cluster:
>
> [root@ceph1 ~]# ceph health detail
> HEALTH_WARN 1 large omap objects
> LARGE_OMAP_OBJECTS 1 large omap objects
>     1 large objects found in pool 'default.rgw.log'
>     Search the cluster log for 'Large omap object found' for more details.
>
> I looked into this and found the object and pool in question
> (default.rgw.log):
>
> [root@ceph1 /var/log/ceph]# grep -R -i 'Large omap object found' .
> ./ceph.log:2019-10-24 12:21:26.984802 osd.194 (osd.194) 715 : cluster
> [WRN] Large omap object found. Object: 5:0fbdcb32:usage::usage.17:head
> Key count: 702330 Size (bytes): 92881228
>
> [root@ceph1 ~]# ceph --format=json pg ls-by-pool default.rgw.log | jq '.[]' | egrep '(pgid|num_large_omap_objects)' | grep -v '"num_large_omap_objects": 0,' | grep -B1 num_large_omap_objects
> "pgid": "5.70",
> "num_large_omap_objects": 1,
>
> While I was investigating, I noticed an enormous amount of entries in
> the RGW usage log:
>
> [root@ceph ~]# radosgw-admin usage show | grep -c bucket
> 223326
> [...]

I recently ran into a similar issue:

https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/AQNGVY7VJ3K6ZGRSTX3E5XIY7DBNPDHW/

You have 702,330 keys in that omap object, so you are being bitten by a
changed default: osd_deep_scrub_large_omap_object_key_threshold was
revised down from 2,000,000 to 200,000 in 14.2.3:

https://github.com/ceph/ceph/commit/d8180c57ac9083f414a23fd393497b2784377735
https://tracker.ceph.com/issues/40583

That's why you didn't see this warning before your recent upgrade.

> There are entries for over 223k buckets! This was pretty scary to see,
> considering we only have maybe 500 legitimate buckets in this fairly new
> cluster.
> Almost all of the entries in the usage log are bogus entries
> from anonymous users. It looks like someone/something was scanning,
> looking for vulnerabilities, etc. Here are a few example entries, notice
> none of the operations were successful:

Caveat: whether or not you really *want* to trim the usage log is up to
you to decide. If you suspect you are dealing with a security breach,
you should definitely export and preserve the usage log before you trim
it, or else delay trimming until you have properly investigated the
problem.

*If* you decide you no longer need those usage log entries, you can use
"radosgw-admin usage trim" with appropriate --start-date, --end-date,
and/or --uid options to clean them up:

https://docs.ceph.com/docs/nautilus/radosgw/admin/#trim-usage

Please let me know if that information is helpful. Thank you!

Cheers,
Florian

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
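For the archives, a sketch of the commands discussed in this thread. This is untested against David's cluster: the backup path, the example date, and the "anonymous" uid are placeholder assumptions, not values from his setup; check radosgw-admin(8) and the admin guide linked above for the authoritative option list before running any of it.

```shell
# Sketch only -- placeholder paths/dates/uid, assumptions not taken
# from the original cluster. Try on a test cluster first.

# 1. Inspect the deep-scrub warning threshold whose default dropped
#    from 2,000,000 to 200,000 keys in 14.2.3:
ceph config get osd osd_deep_scrub_large_omap_object_key_threshold

# 2. Preserve the full usage log before trimming anything, in case
#    the anonymous entries need to be investigated later:
radosgw-admin usage show > /root/usage-backup-$(date +%F).json

# 3. Trim only the entries that are no longer needed, e.g. everything
#    up to the day the warning first appeared (add --uid to restrict
#    the trim to a single user):
radosgw-admin usage trim --end-date=2019-10-24

# 4. A deep scrub of the affected PG (5.70 in David's output)
#    re-counts the omap keys and should clear the warning once the
#    usage object has shrunk below the threshold:
ceph pg deep-scrub 5.70
```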