Re: cosd multi-second stalls cause "wrongly marked me down"

On Fri, Apr 8, 2011 at 3:11 PM, Jim Schutt <jaschut@xxxxxxxxxx> wrote:
> Sage Weil wrote:
>
>>
>> I would also be interested in seeing a system level profile (oprofile?) to
>> see where CPU time is being spent.  There are likely low hanging fruit in
>> the OSD that would reduce CPU overhead.
>
> This will take me a little while, since I need to learn
> about the tools.  But since I need to learn about them
> anyway, that's a good thing.

oprofile is surprisingly easy to get started with. We have a wiki page about it:

http://ceph.newdream.net/wiki/Cpu_profiling

>
>>
>> I guess the other thing that would help to confirm this is to just halve
>> the number of OSDs on your machines in a test and see if the problem goes
>> away.
>
> I was going to try this first, exactly because it seems like
> a definitive test.
>
>>
>>> If my analysis above is correct, do you think anything
>>> can be gained by running the heartbeat and heartbeat
>>> dispatcher threads as SCHED_RR threads?  Since tick() runs
>>> heartbeat_check(), that would also need to be SCHED_RR,
>>> or the heartbeats could arrive on time, but not checked
>>> until it was too late.

Thanks for the ideas. However, I doubt that making the OSD::tick()
thread SCHED_RR would really work.

OSD::tick() takes locks all over the place. Since plenty of threads
other than the tick thread can hold those locks, a SCHED_RR tick thread
would soon end up waiting on ordinary-priority threads, which is
priority inversion. Not to mention, heartbeat_messenger has its own
thread(s), which actually do the work of sending the heartbeat messages.
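
To make the lock dependency concrete, here is a rough sketch. It is not
Ceph code: osd_lock, tick_body, start_rr_thread and run_one_tick are
made-up names for illustration, and only the pthread/sched calls are
real. It starts a stand-in "tick" thread as SCHED_RR (falling back to the
default policy if the process lacks the privilege) while an ordinary
thread holds the lock the tick needs. Whatever its scheduling class, the
tick thread makes no progress until the lock holder releases.

```cpp
#include <pthread.h>
#include <sched.h>
#include <unistd.h>
#include <cassert>
#include <cstring>

// Hypothetical stand-in for one of the locks tick() takes.
static pthread_mutex_t osd_lock = PTHREAD_MUTEX_INITIALIZER;
static int ticks_done = 0;  // only modified while osd_lock is held

// Stand-in for the tick thread: it needs the shared lock before it can
// do its heartbeat check.
static void *tick_body(void *) {
  pthread_mutex_lock(&osd_lock);  // blocks while any other thread holds it
  ++ticks_done;
  pthread_mutex_unlock(&osd_lock);
  return nullptr;
}

// Try to start fn as a SCHED_RR thread. Returns 0 on success, else the
// pthread error code (typically EPERM without real-time privileges).
static int start_rr_thread(pthread_t *t, void *(*fn)(void *), int prio) {
  pthread_attr_t attr;
  int r = pthread_attr_init(&attr);
  if (r) return r;
  // Without EXPLICIT_SCHED the policy set below would be ignored.
  r = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
  if (!r) r = pthread_attr_setschedpolicy(&attr, SCHED_RR);
  if (!r) {
    sched_param sp;
    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = prio;
    r = pthread_attr_setschedparam(&attr, &sp);
  }
  if (!r) r = pthread_create(t, &attr, fn, nullptr);
  pthread_attr_destroy(&attr);
  return r;
}

// Returns 1 if the tick ran as SCHED_RR, 0 if it ran with the default
// policy, -1 on failure.
static int run_one_tick() {
  pthread_mutex_lock(&osd_lock);  // an ordinary thread (this one) holds the lock
  pthread_t t;
  int rr = 1;
  int r = start_rr_thread(&t, tick_body, sched_get_priority_min(SCHED_RR));
  if (r) {  // no privilege for SCHED_RR: fall back to the default policy
    rr = 0;
    r = pthread_create(&t, nullptr, tick_body, nullptr);
  }
  if (r) {
    pthread_mutex_unlock(&osd_lock);
    return -1;
  }
  usleep(10000);          // give the tick thread time to reach the lock
  int seen = ticks_done;  // still 0: the tick cannot run while we hold the lock
  pthread_mutex_unlock(&osd_lock);
  pthread_join(t, nullptr);
  return (seen == 0 && ticks_done == 1) ? rr : -1;
}
```

Calling run_one_tick() once should leave ticks_done at 1 in either
scheduling mode. In the real OSD the lock holder may itself be starved of
CPU, which is where the inversion would bite.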

cheers,
Colin