Re: Significantly increased CPU footprint on OSDs after Hammer -> Jewel upgrade, OSDs occasionally wrongly marked as down

----- On 26 Oct 2016, at 21:25, Haomai Wang haomai@xxxxxxxx wrote:
> On Thu, Oct 27, 2016 at 2:10 AM, Trygve Vea
> <trygve.vea@xxxxxxxxxxxxxxxxxx> wrote:
>> ----- On 26 Oct 2016, at 16:37, Sage Weil sage@xxxxxxxxxxxx wrote:
>>> On Wed, 26 Oct 2016, Trygve Vea wrote:
>>>> ----- On 26 Oct 2016, at 14:41, Sage Weil sage@xxxxxxxxxxxx wrote:
>>>> > On Wed, 26 Oct 2016, Trygve Vea wrote:
>>>> >> Hi,
>>>> >>
>>>> >> We have two Ceph clusters: one exposing pools for both RGW and RBD
>>>> >> (OpenStack/KVM), and one for RBD only.
>>>> >>
>>>> >> After upgrading both to Jewel, we have seen a significantly increased CPU
>>>> >> footprint on the OSDs in the cluster that includes RGW.
>>>> >>
>>>> >> This graph illustrates this: http://i.imgur.com/Z81LW5Y.png
>>>> >
>>>> > That looks pretty significant!
>>>> >
>>>> > This doesn't ring any bells--I don't think it's something we've seen.  Can
>>>> > you do a 'perf top -p `pidof ceph-osd`' on one of the OSDs and grab a
>>>> > snapshot of the output?  It would be nice to compare to hammer but I
>>>> > expect you've long since upgraded all of the OSDs...
>>>>
>>>> # perf record -p 18001
>>>> ^C[ perf record: Woken up 57 times to write data ]
>>>> [ perf record: Captured and wrote 18.239 MB perf.data (408850 samples) ]
>>>>
>>>>
>>>> This is a screenshot of one of the osds during high utilization:
>>>> http://i.imgur.com/031MyIJ.png
>>>
>>> It looks like a ton of time spent in std::string methods and a lot more
>>> map<string,ghobject_t> than I would expect.  Can you do a
>>>
>>> perf record -p `pidof ceph-osd` -g
>>> perf report --stdio
>>
>> Here you go:
>>
>> http://employee.tv.situla.bitbit.net/stdio-report.gz?AWSAccessKeyId=V4NZ37SLP3VOPR2BI5UW&Expires=1477579744&Signature=pt8CvsaVHhYCtJ1kUfRsKq4MY7k%3D
> 
> I can't open this with perf report locally; can anyone else? Or maybe you could
> give a screenshot of the perf report output?

The file contains the output of 'perf report --stdio' gzipped, so it should be viewable with zless.
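For anyone who wants to reproduce the capture, a minimal sketch of the steps (the PID is the one from the earlier run; the 60-second window and output filename are illustrative):

    # capture a call-graph profile of one ceph-osd process for ~60 seconds
    perf record -p 18001 -g -- sleep 60

    # write the text report and gzip it for sharing
    perf report --stdio | gzip > stdio-report.gz

    # view the compressed report without unpacking it
    zless stdio-report.gz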

A screenshot of 'perf report' based on the same perf.data-file is available here: http://i.imgur.com/v6jJhZd.png
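If the gzipped text report is awkward to browse, the hot entries Sage mentioned (std::string and map<string,ghobject_t>) can also be pulled out of it directly, for example:

    # list report lines mentioning the suspect symbols (pattern is illustrative)
    zcat stdio-report.gz | grep -E 'std::string|ghobject_t' | head -40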


-- 
Trygve
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


