Version: Mimic 13.2.2

Lately, during any kind of cluster change (most recently adding OSDs), I'm seeing all of our mons pegging a single core at 100% while leaving the other available cores idle. Cluster commands are slow to respond and clients start seeing session timeouts as well.

When I turn debug_mon up to 20, the logs are dominated by messages like the ones below at a very high rate. Is this indicating an issue?

2018-12-19 10:50:21.713 7f356c2cd700 20 mon.wsu-mon01@1(peon).osd ef8d3b build_incremental inc f82a1 d4 bytes
2018-12-19 10:50:21.713 7f356c2cd700 20 mon.wsu-mon01@1(peon).osd ef8d3b reencode_incremental_map f82a0 with features 700088000202a00
2018-12-19 10:50:21.713 7f356c2cd700 20 mon.wsu-mon01@1(peon).osd ef8d3b build_incremental inc f82a0 f1 bytes
2018-12-19 10:50:21.713 7f356c2cd700 20 mon.wsu-mon01@1(peon).osd ef8d3b reencode_incremental_map f829f with features 700088000202a00

Some searching turned up a bug that should already be resolved but did seem related: https://tracker.ceph.com/issues/23713

My reading of the tracker leads me to believe that re-encoding the map for older clients may be involved, so I'll include the output of 'ceph features' in case it's relevant. Depending on what's in the most recent CentOS kernel, updating and rebooting those clients might be a workaround if that is the cause. Moving to ceph-fuse across all our clients might be an option as well. Any cluster components not shown below report "luminous" with "features": "0x3ffddff8ffa4fffb".

"client": [
    {
        "features": "0x40107b84a842ada",
        "release": "jewel",
        "num": 16
    },
    {
        "features": "0x7010fb86aa42ada",
        "release": "jewel",
        "num": 75
    },
    {
        "features": "0x27018fb86aa42ada",
        "release": "jewel",
        "num": 63
    },
    {
        "features": "0x3ffddff8ffa4fffb",
        "release": "luminous",
        "num": 60
    }
],

thanks,
Ben

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
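P.S. In case it helps anyone reproduce this, the debug level can be bumped at runtime and the feature data gathered with something like the following (just a sketch; it assumes admin access from a mon host with the admin socket available, and mon.wsu-mon01 is simply taken from the log lines above):

    # raise mon debug logging on all mons, then drop it back to the default afterwards
    ceph tell mon.* injectargs '--debug_mon 20/20'
    ceph tell mon.* injectargs '--debug_mon 1/5'

    # per-release summary of connected client feature bits (output included above)
    ceph features

    # per-session detail on one mon, to see which client addresses still carry the old jewel feature bits
    ceph daemon mon.wsu-mon01 sessions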