Version: Mimic 13.2.2

Lately, during any kind of cluster change (most recently adding OSDs), I'm seeing all of our mons pegging a single core at 100% while leaving the other available cores idle. Cluster commands are slow to respond and clients start seeing session timeouts as well.

When I turn debug_mon up to 20, the logs are dominated by messages like the ones below at a very high rate. Is this indicating an issue?

2018-12-19 10:50:21.713 7f356c2cd700 20 mon.wsu-mon01@1(peon).osd ef8d3b build_incremental inc f82a1 d4 bytes
2018-12-19 10:50:21.713 7f356c2cd700 20 mon.wsu-mon01@1(peon).osd ef8d3b reencode_incremental_map f82a0 with features 700088000202a00
2018-12-19 10:50:21.713 7f356c2cd700 20 mon.wsu-mon01@1(peon).osd ef8d3b build_incremental inc f82a0 f1 bytes
2018-12-19 10:50:21.713 7f356c2cd700 20 mon.wsu-mon01@1(peon).osd ef8d3b reencode_incremental_map f829f with features 700088000202a00

Some searching turned up a bug that should already be resolved but did seem related: https://tracker.ceph.com/issues/23713

My reading of the tracker leads me to believe that re-encoding the map for older clients may be involved, so I'll include the output of 'ceph features' in case it's relevant. Depending on what's in the most recent CentOS kernel, updating and rebooting those clients might be a workaround if that is the cause. Moving to ceph-fuse across all our clients might be an option as well. Any cluster components not shown below report "luminous" with "features": "0x3ffddff8ffa4fffb".

"client": [
    {
        "features": "0x40107b84a842ada",
        "release": "jewel",
        "num": 16
    },
    {
        "features": "0x7010fb86aa42ada",
        "release": "jewel",
        "num": 75
    },
    {
        "features": "0x27018fb86aa42ada",
        "release": "jewel",
        "num": 63
    },
    {
        "features": "0x3ffddff8ffa4fffb",
        "release": "luminous",
        "num": 60
    }
],

thanks,
Ben

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
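P.S. In case it helps anyone reproduce this, the debug level can be bumped at runtime and the feature data gathered with something like the following (just a sketch; it assumes admin access from a mon host with the admin socket available, and mon.wsu-mon01 is simply taken from the log lines above):

    # raise mon debug logging on all mons, then drop it back to the default afterwards
    ceph tell mon.* injectargs '--debug_mon 20/20'
    ceph tell mon.* injectargs '--debug_mon 1/5'

    # per-release summary of connected client feature bits (output included above)
    ceph features

    # per-session detail on one mon, to see which client addresses still carry the old jewel feature bits
    ceph daemon mon.wsu-mon01 sessions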