Re: Significantly increased CPU footprint on OSDs after Hammer -> Jewel upgrade, OSDs occasionally wrongly marked as down

On Wed, Oct 26, 2016 at 9:57 PM, Trygve Vea
<trygve.vea@xxxxxxxxxxxxxxxxxx> wrote:
> ----- On 26 Oct 2016 at 15:36, Haomai Wang haomai@xxxxxxxx wrote:
>> On Wed, Oct 26, 2016 at 9:09 PM, Trygve Vea
>> <trygve.vea@xxxxxxxxxxxxxxxxxx> wrote:
>>>
>>> ----- On 26 Oct 2016 at 14:41, Sage Weil sage@xxxxxxxxxxxx wrote:
>>> > On Wed, 26 Oct 2016, Trygve Vea wrote:
>>> >> Hi,
>>> >>
>>> >> We have two Ceph clusters: one exposing pools for both RGW and RBD
>>> >> (OpenStack/KVM), and one for RBD only.
>>> >>
>>> >> After upgrading both to Jewel, we have seen a significantly increased CPU
>>> >> footprint on the OSDs in the cluster that includes RGW.
>>> >>
>>> >> This graph illustrates this: http://i.imgur.com/Z81LW5Y.png
>>> >
>>> > That looks pretty significant!
>>> >
>>> > This doesn't ring any bells--I don't think it's something we've seen.  Can
>>> > you do a 'perf top -p `pidof ceph-osd`' on one of the OSDs and grab a
>>> > snapshot of the output?  It would be nice to compare to hammer but I
>>> > expect you've long since upgraded all of the OSDs...
>>>
>>> # perf record -p 18001
>>> ^C[ perf record: Woken up 57 times to write data ]
>>> [ perf record: Captured and wrote 18.239 MB perf.data (408850 samples) ]
>>>
>>>
>>> This is a screenshot of one of the osds during high utilization:
>>> http://i.imgur.com/031MyIJ.png
>>
>>
>> Hmm, it's hard to tell which component is responsible from this. Could
>> you run top -Hp [osd pid] and find the threads with the highest CPU
>> usage, then perf top -t [osd tid] on each of those?
>
> http://i.imgur.com/Bxny5bd.png

From this graph, it looks like the filestore threads are spending their
time in LFNIndex (FileStore's long-filename indexing code), which would
explain the CPU utilization.
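
If you want to confirm that, something along these lines should work (a
rough sketch: it assumes a single ceph-osd process on the host, <tid> is
whatever top -Hp reports as the hottest thread, and the 30-second window
is arbitrary):

  # list the OSD's threads by CPU usage and note the hottest tids
  top -Hp $(pidof ceph-osd)

  # sample one hot thread with call graphs for ~30 seconds
  perf record -t <tid> -g -- sleep 30

  # check whether LFNIndex/FileStore symbols dominate the samples
  perf report --stdio --sort symbol | grep -iE 'lfnindex|filestore'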

>
>
>> To be more clear, the top -Hp [pid] output is also helpful for digging
>> in, since multiple threads may be reaching high CPU utilization. If you
>> can perf top -t each tid and show the graphs, that would be really
>> wonderful!
>
> In this case I saw two threads with high utilization. I peeked at both, and the active symbols looked pretty much the same for the two of them.
>
> What do you mean by "show the graphs"?
>
>
> Regards
> --
> Trygve Vea
> Redpill Linpro AS
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


