Re: Jewel (10.2.7) osd suicide timeout while deep-scrub


 



Thanks, I'll try and do that. Since I'm running a cluster with
multiple nodes, do I have to set this in ceph.conf on all nodes, or
does it suffice to set it on just the node hosting that particular OSD?

On 15 August 2017 at 22:51, Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
>
> On Tue, Aug 15, 2017 at 7:03 AM Andreas Calminder
> <andreas.calminder@xxxxxxxxxx> wrote:
>>
>> Hi,
>> I got hit with OSD suicide timeouts while deep-scrub runs on a
>> specific pg. There's a RH article
>> (https://access.redhat.com/solutions/2127471) suggesting changing
>> osd_scrub_thread_suicide_timeout from 60s to a higher value. The
>> problem is that the article is for Hammer, and
>> osd_scrub_thread_suicide_timeout doesn't exist in the output of
>> ceph daemon osd.34 config show
>> Also, the default timeout (60s) mentioned in the article doesn't
>> match the suicide timeout in the logs:
>>
>> 2017-08-15 15:39:37.512216 7fb293137700  1 heartbeat_map is_healthy
>> 'OSD::osd_op_tp thread 0x7fb231adf700' had suicide timed out after 150
>> 2017-08-15 15:39:37.518543 7fb293137700 -1 common/HeartbeatMap.cc: In
>> function 'bool ceph::HeartbeatMap::_check(const
>> ceph::heartbeat_handle_d*, const char*, time_t)' thread 7fb293137700
>> time 2017-08-15 15:39:37.512230
>> common/HeartbeatMap.cc: 86: FAILED assert(0 == "hit suicide timeout")
>>
>> The suicide timeout (150) does match the
>> osd_op_thread_suicide_timeout, however when I try changing this I get:
>> ceph daemon osd.34 config set osd_op_thread_suicide_timeout 300
>> {
>>     "success": "osd_op_thread_suicide_timeout = '300' (unchangeable) "
>> }
>>
>> And the deep-scrub still hits the suicide timeout after 150 seconds,
>> just like before.
>>
>> The cluster is left with osd.34 flapping. Is there any way to let the
>> deep-scrub finish and get out of the infinite deep-scrub loop?
>
>
> You can set that option in ceph.conf. It's "unchangeable" because it's used
> to initialize some other structures at boot, so you can't edit it live.
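>
> For example (a sketch only, assuming the option belongs in the [osd]
> section of ceph.conf on the node hosting osd.34, and reusing the 300s
> value from your failed "config set" attempt; the OSD daemon has to be
> restarted for the new value to take effect):
>
>     [osd]
>     osd_op_thread_suicide_timeout = 300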
>
>>
>>
>> Regards,
>> Andreas
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


