Re: High MON cpu usage when cluster is changing

Sage Weil <sweil@xxxxxxxxxx> · Fri, 13 Apr 2018 22:05:56 +0000 (UTC)

On Sat, 14 Apr 2018, Xiaoxi Chen wrote:
> Hi,
> 
>     We have been consistently seeing this issue since upgrading from
> jewel to luminous. It looks like the monitor cannot handle
> mon_subscribe requests from clients fast enough, and we then see:
> 
>     - high CPU usage (1600%+ with the simple messenger) on the monitor
>     - cluster PG state changing slowly, as OSDs cannot get the latest
>       map fast enough
>     - in some cases, such as rebooting an OSD node (24 OSDs per node),
>       an even bigger impact: OSDs cannot even update their auth in
>       time, and after a while we saw a massive number of OSDs marked
>       down due to heartbeat failure, with log lines like
>        2018-04-11 21:19:24.772558 7f6bbb7f5700  0 cephx server osd.234:  unexpected key: req.key=690bba2ca98774a2 expected_key=f63feaae2014a837
>        2018-04-11 21:19:26.539295 7f6bbb7f5700  0 cephx server osd.365: unexpected key: req.key=a0eb995e1bef1bf4 expected_key=bafe2e4d55a63478
> 
>    There are more details about the attempts we have made in the
> ticket http://tracker.ceph.com/issues/23713.
> 
>    Any suggestions are much appreciated. Thanks.

My guess is that this is the compat reencoding of the OSDMap for the 
pre-luminous clients.

Are you by chance making use of the crush-compat balancer? That would 
additionally require a reencoded crush map.

Can you do a 'perf top -p `pidof ceph-mon`' while this is happening to see 
where the time is being spent?

Thanks!
sage
