Re: Significantly increased CPU footprint on OSDs after Hammer -> Jewel upgrade, OSDs occasionally wrongly marked as down

Haomai Wang <haomai@xxxxxxxx> · Wed, 26 Oct 2016 21:36:29 +0800

On Wed, Oct 26, 2016 at 9:09 PM, Trygve Vea
<trygve.vea@xxxxxxxxxxxxxxxxxx> wrote:
>
> ----- On 26 Oct 2016 14:41, Sage Weil sage@xxxxxxxxxxxx wrote:
> > On Wed, 26 Oct 2016, Trygve Vea wrote:
> >> Hi,
> >>
> >> We have two Ceph clusters: one exposing pools for both RGW and RBD
> >> (OpenStack/KVM), and one only for RBD.
> >>
> >> After upgrading both to Jewel, we have seen a significantly increased CPU
> >> footprint on the OSDs that are a part of the cluster which includes RGW.
> >>
> >> This graph illustrates this: http://i.imgur.com/Z81LW5Y.png
> >
> > That looks pretty significant!
> >
> > This doesn't ring any bells--I don't think it's something we've seen.  Can
> > you do a 'perf top -p `pidof ceph-osd`' on one of the OSDs and grab a
> > snapshot of the output?  It would be nice to compare to hammer but I
> > expect you've long since upgraded all of the OSDs...
>
> # perf record -p 18001
> ^C[ perf record: Woken up 57 times to write data ]
> [ perf record: Captured and wrote 18.239 MB perf.data (408850 samples) ]
>
>
> This is a screenshot of one of the osds during high utilization: http://i.imgur.com/031MyIJ.png

Hmm, it's hard to tell which component is responsible. Could you run
top -Hp [osd pid] to find the threads using the most CPU, and then run
perf top -t [osd tid] on them to see what they are doing?

To be clearer: the top -Hp [pid] output is itself helpful for digging,
since several threads may be at high CPU utilization. If you can run
perf top -t on each tid and show the graph for each, that would be
really wonderful!
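
As a rough sketch of those two steps (the function names below are made
up for illustration and are not part of Ceph or perf; also note that ps
reports %CPU averaged over the thread's lifetime, whereas top -H shows
the current interval, so treat the ranking as a first approximation):

```shell
#!/bin/sh
# Sketch only: rank the threads of one process by CPU usage, then print a
# perf command for each of the busiest ones. All function names are
# illustrative assumptions.

# Read "TID %CPU NAME" lines on stdin and print the N highest-%CPU lines.
rank_threads() {
    LC_ALL=C sort -k2,2nr | head -n "${1:-5}"
}

# List per-thread CPU usage for one PID (procps ps; -L shows threads).
list_threads() {
    ps -L -o tid=,pcpu=,comm= -p "$1"
}

# Print one "perf top -t TID" line per busy thread of the given PID.
suggest_perf() {
    list_threads "$1" | rank_threads "${2:-5}" |
    while read -r tid cpu name; do
        printf 'perf top -t %s    # %s at %s%% CPU\n' "$tid" "$name" "$cpu"
    done
}

# Example, assuming the OSD pid from the perf record run quoted above:
#   suggest_perf 18001 3
```

Each printed line can then be run by hand, one thread at a time.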

>
>
>
> Link to download binary format sent directly to you.
>
>
> Your expectation about upgrades is correct.  We actually had some problems performing the upgrade, so we ended up re-initializing the OSDs as empty and backfilling into Jewel.  When we first started them on Jewel, they ended up blocking
>
> I want to add that the resource usage isn't flat - this is a day graph of one of the osd servers: http://i.imgur.com/MLfoVgE.png
>
>
>
> Regards
> --
> Trygve Vea
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com