Re: OSDs marked OUT wrongly after monitor failover

We can easily reproduce it:
1. Stop osd.0.
2. Start osd.0.
3. Wait for at least mon_osd_down_out_interval (e.g. 300s).
4. Stop osd.0.
5. Stop monA (the leader).
6. monB wins the election and becomes the leader, and osd.0 is then marked out.
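
To make the suspected mechanism concrete, here is a toy model of the steps above. It is only a sketch of the hypothesis from the original report quoted below, not the real OSDMonitor code: the ToyMonitor class, the simulate() helper and the clear_on_peons switch are all made up for illustration, and it assumes every monitor records a down time for a down OSD it has no entry for, while only the leader clears or acts on entries in tick().

```python
from dataclasses import dataclass, field


@dataclass
class ToyMonitor:
    """Toy stand-in for one monitor (hypothetical, not Ceph's OSDMonitor)."""
    down_out_interval: float = 300.0  # models mon_osd_down_out_interval
    # osd id -> time this monitor first saw the OSD down
    down_pending_out: dict = field(default_factory=dict)

    def observe_map(self, osd_up, now):
        # Assumption: every monitor, leader or not, records a down time
        # for a down OSD unless it already holds an entry for it.
        for osd, up in osd_up.items():
            if not up:
                self.down_pending_out.setdefault(osd, now)

    def tick(self, osd_up, now, is_leader, clear_on_peons=False):
        marked_out = []
        if not is_leader and not clear_on_peons:
            return marked_out  # suspected bug: peons never clean up
        for osd, since in list(self.down_pending_out.items()):
            if osd_up[osd]:
                # OSD came back up: drop the stale record.
                del self.down_pending_out[osd]
            elif is_leader and now - since >= self.down_out_interval:
                # Only the leader marks OSDs out.
                marked_out.append(osd)
                del self.down_pending_out[osd]
        return marked_out


def simulate(clear_on_peons=False):
    """Replay the six reproduction steps; return OSDs marked out by monB."""
    mon_a, mon_b = ToyMonitor(), ToyMonitor()
    osd_up = {0: True}

    def step(now, leader, peons):
        for mon in [leader] + peons:
            mon.observe_map(osd_up, now)
        out = leader.tick(osd_up, now, is_leader=True)
        for peon in peons:
            peon.tick(osd_up, now, is_leader=False,
                      clear_on_peons=clear_on_peons)
        return out

    osd_up[0] = False
    step(0, mon_a, [mon_b])      # 1. stop osd.0
    osd_up[0] = True
    step(10, mon_a, [mon_b])     # 2. start osd.0 (only monA clears its entry)
    step(400, mon_a, [mon_b])    # 3. wait longer than the interval
    osd_up[0] = False
    step(410, mon_a, [mon_b])    # 4. stop osd.0 again
    return step(420, mon_b, [])  # 5./6. monA stops, monB leads


print(simulate())                     # [0]: osd.0 marked out after ~10s down
print(simulate(clear_on_peons=True))  # []: stale entry was cleared on monB
```

In the first run monB still holds the down time from step 1, so after the failover it computes 420s of downtime instead of 10s. Whether clearing the record on peons is the right fix is a separate question; the switch is there only to show that the stale entry is what triggers the wrong OUT.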

2016-10-27 10:48 GMT+08:00 Ridge Chen <ridge.chen@xxxxxxxxx>:
> I also raised a bug report: http://tracker.ceph.com/issues/17719
>
> 2016-10-27 10:36 GMT+08:00 Ridge Chen <ridge.chen@xxxxxxxxx>:
>> Hi Experts,
>>
>> Recently we found an issue with our Ceph cluster; the version is 0.94.6.
>>
>> We wanted to add RAM to the ceph nodes, so we needed to stop
>> the ceph service on the nodes first. When we did that on the first
>> node, we found the OSDs on that node marked OUT and backfill started
>> (only DOWN is expected in this case). The first node is somewhat special
>> in that it also hosts the leader monitor.
>>
>> We then checked the monitor log and found the following:
>>
>> cluster [INF] osd.0 out (down for 3375169.141844)
>>
>> It looks like the monitor that just became leader has stale
>> "down_pending_out" records, computes a very long DOWN time, and
>> finally decides to mark the OSDs OUT.
>>
>> After researching the related code, the reason could be the following:
>>
>> 1. "down_pending_out" was set a month ago for those OSDs because of a
>> network issue.
>> 2. The down OSDs came up and joined the cluster again. "down_pending_out"
>> is cleared in the "OSDMonitor::tick()" method, but only on the
>> leader monitor.
>> 3. When we stopped the ceph service on the first node, the monitors
>> failed over. The new leader monitor believed the OSDs had been DOWN
>> for a very long time and wrongly marked them OUT.
>>
>>
>> What do you think of this?
>>
>> Regards
>> Ridge
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html