Re: Significantly increased CPU footprint on OSDs after Hammer -> Jewel upgrade, OSDs occasionally wrongly marked as down

On Thu, Oct 27, 2016 at 2:10 AM, Trygve Vea
<trygve.vea@xxxxxxxxxxxxxxxxxx> wrote:
> ----- Den 26.okt.2016 16:37 skrev Sage Weil sage@xxxxxxxxxxxx:
>> On Wed, 26 Oct 2016, Trygve Vea wrote:
>>> ----- Den 26.okt.2016 14:41 skrev Sage Weil sage@xxxxxxxxxxxx:
>>> > On Wed, 26 Oct 2016, Trygve Vea wrote:
>>> >> Hi,
>>> >>
>>> >> We have two Ceph-clusters, one exposing pools both for RGW and RBD
>>> >> (OpenStack/KVM) pools - and one only for RBD.
>>> >>
>>> >> After upgrading both to Jewel, we have seen a significantly increased CPU
>>> >> footprint on the OSDs that are a part of the cluster which includes RGW.
>>> >>
>>> >> This graph illustrates this: http://i.imgur.com/Z81LW5Y.png
>>> >
>>> > That looks pretty significant!
>>> >
>>> > This doesn't ring any bells--I don't think it's something we've seen.  Can
>>> > you do a 'perf top -p `pidof ceph-osd`' on one of the OSDs and grab a
>>> > snapshot of the output?  It would be nice to compare to hammer but I
>>> > expect you've long since upgraded all of the OSDs...
>>>
>>> # perf record -p 18001
>>> ^C[ perf record: Woken up 57 times to write data ]
>>> [ perf record: Captured and wrote 18.239 MB perf.data (408850 samples) ]
>>>
>>>
>>> This is a screenshot of one of the osds during high utilization:
>>> http://i.imgur.com/031MyIJ.png
>>
>> It looks like a ton of time is spent in std::string methods, and a lot
>> more map<string,ghobject_t> activity than I would expect.  Can you do a
>>
>> perf record -p `pidof ceph-osd` -g
>> perf report --stdio
>
> Here you go:
>
> http://employee.tv.situla.bitbit.net/stdio-report.gz?AWSAccessKeyId=V4NZ37SLP3VOPR2BI5UW&Expires=1477579744&Signature=pt8CvsaVHhYCtJ1kUfRsKq4MY7k%3D

I can't run perf report on this locally; can anyone else? Or could you
share a screenshot of the perf report output?
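For anyone trying to reproduce the capture Sage suggested, the full sequence looks roughly like the following sketch. The 30-second recording window, the output filenames, and the use of the first PID from `pidof` are illustrative choices, not anything specified in the thread:

```shell
# Pick one ceph-osd PID; `pidof` returns every OSD on the host,
# so choose explicitly on a multi-OSD box.
OSD_PID=$(pidof ceph-osd | awk '{print $1}')

# Record call-graph (-g) samples from that process for ~30 seconds.
perf record -g -p "$OSD_PID" -o perf.data -- sleep 30

# Render a plain-text report suitable for attaching to a mail.
# Note the flag is --stdio (text output), not --stdout.
perf report --stdio -i perf.data > stdio-report.txt

# Symbol-heavy reports get large; compress before sharing.
gzip stdio-report.txt
```

Reading the resulting text report requires the same binaries and debug symbols that were present on the capture host, which is likely why the uploaded report could not be opened elsewhere.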

>
>
>>> Link to download binary format sent directly to you.
>>>
>>>
>>> Your expectation about upgrades is correct.  We actually had some
>>> problems performing the upgrade, so we ended up re-initializing the osds
>>> as empty and backfilling into jewel.  When we first started them on
>>> jewel, they ended up blocking.
>>
>> Hrm, this is a new one for me too.  They've all been upgraded now?  It
>> would be nice to see a log or backtrace to see why they got stuck.
>
> Sorry, I cannot provide this information anymore :(
>
>
> --
> Trygve
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


