Re: CephFS metadata: Large omap object found

Thank you, Paul.

> The thresholds were recently reduced by a factor of 10. I guess you
> have a lot of (open) files? Maybe use more active MDS servers?

We'll consider adding more MDS servers, although the workload hasn't been an issue yet.
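If we do go that route, I assume it's mostly a matter of raising max_mds for our filesystem (it's simply named "cephfs") and making sure there are enough standby daemons to fill the additional rank, something like (the 2 is just an example value):

ceph01:~ # ceph fs set cephfs max_mds 2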

> Or increase the thresholds, I wouldn't worry at all about 200k omap
> keys if you are running on reasonable hardware.
> The usual argument for a low number of omap keys is recovery time, but
> if you are running a metadata-heavy workload on something that has
> problems recovering 200k keys in less than a few seconds, then you are
> doing something wrong anyways.


We haven't had any issues with MDS failovers or recovery yet, so I
guess higher thresholds would be fine. To get rid of the warning (for
a week) it was sufficient to issue a deep-scrub on the affected PG
while the listomapkeys output was below 200k. Maybe we were just
"lucky" until now because the deep-scrubs are issued outside of
business hours, when the number of open files should be lower.
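For the record, this is roughly the workflow I used (key count check, then a manual deep-scrub of the affected PG), plus what I would probably run if we decide to raise the key threshold cluster-wide; the 300000 is just a value picked for illustration:

ceph01:~ # rados -p cephfs-metadata listomapkeys mds0_openfiles.0 | wc -l
ceph01:~ # ceph pg deep-scrub 36.6
ceph01:~ # ceph config set osd osd_deep_scrub_large_omap_object_key_threshold 300000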

Anyway, thank you for your input, it seems as if this is not a problem at the moment.

Regards,
Eugen


Quoting Paul Emmerich <paul.emmerich@xxxxxxxx>:

The thresholds were recently reduced by a factor of 10. I guess you
have a lot of (open) files? Maybe use more active MDS servers?

Or increase the thresholds, I wouldn't worry at all about 200k omap
keys if you are running on reasonable hardware.
The usual argument for a low number of omap keys is recovery time, but
if you are running a metadata-heavy workload on something that has
problems recovering 200k keys in less than a few seconds, then you are
doing something wrong anyways.


Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Tue, Oct 1, 2019 at 9:10 AM Eugen Block <eblock@xxxxxx> wrote:

Hi all,

We have a new issue in our Nautilus cluster. The large omap warning
seems to be more common for RGW usage, but we currently only use
CephFS and RBD. I found one thread [1] regarding the metadata pool,
but it doesn't really help in our case.

The deep-scrub of PG 36.6 brought up this message (deep-scrub finished
with "ok"):

2019-09-30 20:18:22.548401 osd.9 (osd.9) 275 : cluster [WRN] Large
omap object found. Object: 36:654134d2:::mds0_openfiles.0:head Key
count: 238621 Size (bytes): 9994510


I checked xattr (none) and omapheader:

ceph01:~ # rados -p cephfs-metadata listxattr mds0_openfiles.0
ceph01:~ # rados -p cephfs-metadata getomapheader mds0_openfiles.0
header (42 bytes) :
00000000  13 00 00 00 63 65 70 68  20 66 73 20 76 6f 6c 75  |....ceph fs volu|
00000010  6d 65 20 76 30 31 31 01  01 0d 00 00 00 74 c3 12  |me v011......t..|
00000020  00 00 00 00 00 01 00 00  00 00                    |..........|
0000002a

ceph01:~ # ceph fs volume ls
[
   {
     "name": "cephfs"
   }
]


The respective OSD has default thresholds regarding large_omap:

ceph02:~ # ceph daemon osd.9 config show | grep large_omap
     "osd_deep_scrub_large_omap_object_key_threshold": "200000",
     "osd_deep_scrub_large_omap_object_value_sum_threshold": "1073741824",


Can anyone point me to a solution for this?

Best regards,
Eugen


[1] http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-March/033813.html


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



