Thank you, Paul.
> The thresholds were recently reduced by a factor of 10. I guess you
> have a lot of (open) files? Maybe use more active MDS servers?
We'll consider adding more MDS servers, although the workload hasn't
been an issue yet.
> Or increase the thresholds, I wouldn't worry at all about 200k omap
> keys if you are running on reasonable hardware.
> The usual argument for a low number of omap keys is recovery time, but
> if you are running a metadata-heavy workload on something that has
> problems recovering 200k keys in less than a few seconds, then you are
> doing something wrong anyways.
We haven't had any issues with MDS failovers or recovery so far, so I
guess higher thresholds would be fine.
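For reference, raising the key threshold back to its value before the factor-of-10 reduction might look roughly like the sketch below. It only prints the command rather than running it. The option name is taken from the `config show` output further down in this thread; the helper name `raise_threshold_cmd` is made up for illustration, and setting the value cluster-wide through `ceph config set osd` is an assumption about how one would want to apply it.

```shell
#!/bin/sh
# Sketch only: builds the command, does not change any cluster setting.

# Option name as shown in the "config show" output in this thread.
OPTION="osd_deep_scrub_large_omap_object_key_threshold"
CURRENT=200000                 # current default reported by osd.9
PREVIOUS=$((CURRENT * 10))     # undo the factor-of-10 reduction

# Hypothetical helper: print the command that would raise the threshold.
raise_threshold_cmd() {
    echo "ceph config set osd ${OPTION} ${PREVIOUS}"
}
```

Calling `raise_threshold_cmd` prints the command for review. An existing warning would presumably only clear after the next deep-scrub of the affected PG.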
To get rid of the warning (for a week) it was sufficient to issue a
deep-scrub on the affected PG while the listomapkeys output was lower
than 200k. Maybe we were just "lucky" until now because the
deep-scrubs are issued outside of business hours, so the number of
open files should be lower.
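The workaround described above (deep-scrub the PG only while the key count is below the threshold) could be scripted roughly as follows. This is a minimal sketch, not something we run in production: the function names are made up for illustration, the pool, object and PG ID are the ones from the original report, and it assumes that `rados listomapkeys` prints one key per line.

```shell
#!/bin/sh
# Sketch: deep-scrub the PG only while the omap key count is under the threshold.

POOL="cephfs-metadata"       # pool, object and PG from the original report
OBJECT="mds0_openfiles.0"
PGID="36.6"
THRESHOLD=200000             # osd_deep_scrub_large_omap_object_key_threshold

# Count the omap keys of an object (assumes one key per output line).
omap_key_count() {
    rados -p "$1" listomapkeys "$2" | wc -l | tr -d ' '
}

# Succeeds when the count ($1) is strictly lower than the threshold ($2).
below_threshold() {
    [ "$1" -lt "$2" ]
}

# Issue the deep-scrub only if the object is currently small enough.
scrub_if_small() {
    count=$(omap_key_count "$POOL" "$OBJECT")
    if below_threshold "$count" "$THRESHOLD"; then
        ceph pg deep-scrub "$PGID"
    else
        echo "key count $count is not below $THRESHOLD, not scrubbing" >&2
    fi
}
```

Nothing runs until `scrub_if_small` is called, for example from a cron job outside of business hours.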
Anyway, thank you for your input, it seems as if this is not a problem
at the moment.
Regards,
Eugen
Quoting Paul Emmerich <paul.emmerich@xxxxxxxx>:
The thresholds were recently reduced by a factor of 10. I guess you
have a lot of (open) files? Maybe use more active MDS servers?
Or increase the thresholds, I wouldn't worry at all about 200k omap
keys if you are running on reasonable hardware.
The usual argument for a low number of omap keys is recovery time, but
if you are running a metadata-heavy workload on something that has
problems recovering 200k keys in less than a few seconds, then you are
doing something wrong anyways.
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
On Tue, Oct 1, 2019 at 9:10 AM Eugen Block <eblock@xxxxxx> wrote:
Hi all,
we have a new issue in our Nautilus cluster.
The large omap warning seems to be more common for RGW usage, but we
currently only use CephFS and RBD. I found one thread [1] regarding
the metadata pool, but it doesn't really help in our case.
The deep-scrub of PG 36.6 brought up this message (deep-scrub finished
with "ok"):
2019-09-30 20:18:22.548401 osd.9 (osd.9) 275 : cluster [WRN] Large
omap object found. Object: 36:654134d2:::mds0_openfiles.0:head Key
count: 238621 Size (bytes): 9994510
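To track these values over time, the fields of such a warning can be pulled out with `sed`. The following is a rough sketch: the function names are hypothetical, and it assumes the warning has been joined back into a single line (it is only wrapped here by the mail client).

```shell
#!/bin/sh
# Sketch: extract fields from a "Large omap object found" line read from stdin.

# Object name, e.g. 36:654134d2:::mds0_openfiles.0:head
large_omap_object() {
    sed -n 's/.*Object: \([^ ]*\).*/\1/p'
}

# Number of omap keys reported by the deep-scrub
large_omap_key_count() {
    sed -n 's/.*Key count: \([0-9][0-9]*\).*/\1/p'
}

# Total size in bytes reported by the deep-scrub
large_omap_size_bytes() {
    sed -n 's/.*Size (bytes): \([0-9][0-9]*\).*/\1/p'
}
```

Each function prints nothing when the line does not contain the corresponding field, so the output can be tested for emptiness before comparing it to a threshold.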
I checked xattr (none) and omapheader:
ceph01:~ # rados -p cephfs-metadata listxattr mds0_openfiles.0
ceph01:~ # rados -p cephfs-metadata getomapheader mds0_openfiles.0
header (42 bytes) :
00000000  13 00 00 00 63 65 70 68  20 66 73 20 76 6f 6c 75  |....ceph fs volu|
00000010  6d 65 20 76 30 31 31 01  01 0d 00 00 00 74 c3 12  |me v011......t..|
00000020  00 00 00 00 00 01 00 00  00 00                    |..........|
0000002a
ceph01:~ # ceph fs volume ls
[
{
"name": "cephfs"
}
]
The respective OSD has default thresholds regarding large_omap:
ceph02:~ # ceph daemon osd.9 config show | grep large_omap
"osd_deep_scrub_large_omap_object_key_threshold": "200000",
"osd_deep_scrub_large_omap_object_value_sum_threshold": "1073741824",
Can anyone point me to a solution for this?
Best regards,
Eugen
[1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-March/033813.html
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx