Re: High CPU usage by ceph-mgr in 14.2.5

We'd like to verify whether the network ping time monitoring feature in
14.2.5 is contributing to this problem. It'd be great if someone could try
https://tracker.ceph.com/issues/43364#note-3 and let us know.
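
Separately, for anyone who wants to confirm which ceph-mgr threads are busy (the reports quoted below mention "mgr-fin" and "ms_dispatch"), here is a rough sketch that samples per-thread CPU ticks from /proc twice and ranks the difference. It assumes Linux and the field layout described in proc(5); the function names and the 5 second window are made up for illustration, and none of this comes from Ceph itself.

```python
import os
import sys
import time


def parse_stat(text):
    """Return (thread_name, cpu_ticks) from the contents of a /proc stat file."""
    # The name sits in parentheses and may itself contain spaces or ')',
    # so split on the last ')' rather than on whitespace.
    head, _, tail = text.rpartition(")")
    name = head.split("(", 1)[1]
    fields = tail.split()
    # tail starts at field 3 (state); utime and stime are fields 14 and 15.
    utime, stime = int(fields[11]), int(fields[12])
    return name, utime + stime


def sample(pid):
    """Map thread id -> (name, cpu_ticks) for every thread of pid."""
    out = {}
    task_dir = "/proc/%d/task" % pid
    for tid in os.listdir(task_dir):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                out[tid] = parse_stat(f.read())
        except OSError:
            pass  # thread exited between listdir and open
    return out


def busiest(before, after, top=5):
    """Rank threads by CPU ticks consumed between two samples."""
    deltas = []
    for tid, (name, ticks) in after.items():
        if tid in before:
            deltas.append((ticks - before[tid][1], name, tid))
    deltas.sort(reverse=True)
    return deltas[:top]


if __name__ == "__main__" and len(sys.argv) > 1:
    pid = int(sys.argv[1])      # e.g. the output of `pidof ceph-mgr`
    first = sample(pid)
    time.sleep(5)               # sampling window, chosen arbitrarily
    for ticks, name, tid in busiest(first, sample(pid)):
        print("%-16s tid=%s ticks=%d" % (name, tid, ticks))
```

Saved under any file name and run with the mgr pid as its only argument, it prints the five threads that used the most CPU during the window. It only says which threads are busy, not why, so it does not replace a wallclock profile.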

Thanks,
Neha

On Thu, Dec 19, 2019 at 8:48 AM Mark Nelson <mnelson@xxxxxxxxxx> wrote:
>
> If you can get a wallclock profiler on the mgr process we might be able
> to figure out specifics of what's taking so much time (i.e. processing
> pg_summary or something else).  Assuming you have gdb with the python
> bindings and the ceph debug packages installed, if you (or anyone)
> could try gdbpmp on the 100% mgr process that would be fantastic.
>
>
> https://github.com/markhpc/gdbpmp
>
>
> gdbpmp.py -p`pidof ceph-mgr` -n 1000 -o mgr.gdbpmp
>
>
> If you want to view the results:
>
>
> gdbpmp.py -i mgr.gdbpmp -t 1
>
>
> Thanks,
>
> Mark
>
>
> On 12/19/19 6:29 AM, Paul Emmerich wrote:
> > We're also seeing unusually high mgr CPU usage on some setups; the
> > only thing they seem to have in common is more than 300 OSDs.
> >
> > The threads using the CPU are "mgr-fin" and "ms_dispatch".
> >
> >
> > Paul
> >
> > --
> > Paul Emmerich
> >
> > Looking for help with your Ceph cluster? Contact us at https://croit.io
> >
> > croit GmbH
> > Freseniusstr. 31h
> > 81247 München
> > www.croit.io
> > Tel: +49 89 1896585 90
> >
> >
> > On Thu, Dec 19, 2019 at 9:40 AM Serkan Çoban <cobanserkan@xxxxxxxxx> wrote:
> >
> >     +1
> >     1500 OSDs, mgr is at a constant 100% after upgrading from 14.2.2 to 14.2.5.
> >
> >     On Thu, Dec 19, 2019 at 11:06 AM Toby Darling <toby@xxxxxxxxxxxxxxxxx> wrote:
> >     >
> >     > On 18/12/2019 22:40, Bryan Stillwell wrote:
> >     > > That's how we noticed it too.  Our graphs went silent after the upgrade
> >     > > completed.  Is your large cluster over 350 OSDs?
> >     >
> >     > A 'me too' on this - graphs have gone quiet, and mgr is using 100% CPU.
> >     > This happened when we grew our 14.2.5 cluster from 328 to 436 OSDs.
> >     >
> >     > Cheers
> >     > Toby
> >     > --
> >     > Toby Darling, Scientific Computing (2N249)
> >     > MRC Laboratory of Molecular Biology
> >     > Francis Crick Avenue
> >     > Cambridge Biomedical Campus
> >     > Cambridge CB2 0QH
> >     > Phone 01223 267070
> >     > _______________________________________________
> >     > ceph-users mailing list -- ceph-users@xxxxxxx
> >     > To unsubscribe send an email to ceph-users-leave@xxxxxxx