Re: [PATCH 0/2] osd: force restart peering when osd is marked down

Kouya Shimura <kouya@xxxxxxxxxxxxxx> · Fri, 1 Jun 2018 14:57:57 +0900

Hi Mykola,

Thank you for your kind advice!

I created a tracker ticket.

http://tracker.ceph.com/issues/24373
https://github.com/ceph/ceph/pull/22358

If I made a mistake, please let me know.

Thanks,
Kouya

Mykola Golub <to.my.trociny@xxxxxxxxx> writes:
> Hi Kouya,
>
> Thank you for reporting and trying to fix this!
>
> Could you please create a tracker ticket [1] to make sure it is not
> lost?
>
> And I think it would be much easier to review your patches (and to
> bring them to the core developers' attention) if you created a pull
> request [2]. If you do this, please add the PR link to the tracker ticket.
>
> If you have problems with any of this, just let us know; I can do it
> for you.
>
> [1] http://tracker.ceph.com/projects/rados
> [2] https://github.com/ceph/ceph
>
> On Mon, May 28, 2018 at 06:36:18PM +0900, Kouya Shimura wrote:
>> Hi,
>> 
>> I've found that a PG can get stuck forever in 'unfound_recovery' after
>> some OSDs are marked down.
>> 
>> For example, the following steps reproduce this.
>> 
>> 1) Create an EC 2+1 pool. Assume a PG has the up/acting set [1,0,2].
>> 2) Execute "ceph osd out osd.0 osd.2". Now the PG has the up/acting set [1,3,5].
>> 3) Put some objects into the PG.
>> 4) Execute "ceph osd in osd.0 osd.2". The PG starts recovering to [1,0,2].
>> 5) Execute "ceph osd down osd.3 osd.5". (These downs are fake; osd.3
>>    and osd.5 are not actually down.)
>>    This causes the PG to transition to 'unfound_recovery' and stay there forever.
>> 
>> Interestingly, this bad situation is resolved by marking down
>> another OSD.
>> 
>> 6) Executing "ceph osd down osd.0" (any OSD in the acting set is ok) resolves
>>    'unfound_recovery' and restarts recovery.
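
Steps 1) to 6) above could be scripted roughly as follows. This is only a sketch: the pool name, profile name, object name and input file are hypothetical, the OSD ids are copied verbatim from the steps (the real up/acting set depends on the CRUSH map), and the waits between steps that a real run needs are left out. By default it only prints the commands instead of running them.

```python
import shlex
import subprocess

POOL = "ecpool"      # hypothetical pool name
PROFILE = "ec21"     # hypothetical erasure-code profile name


def repro_commands(pool=POOL, profile=PROFILE):
    """Return steps 1-5 as a list of argv lists."""
    return [
        # 1) EC 2+1 pool with a single PG (k=2 data chunks, m=1 coding chunk)
        ["ceph", "osd", "erasure-code-profile", "set", profile, "k=2", "m=1"],
        ["ceph", "osd", "pool", "create", pool, "1", "1", "erasure", profile],
        # 2) mark two members of the acting set out
        ["ceph", "osd", "out", "osd.0", "osd.2"],
        # 3) write an object (object name and source file are placeholders)
        ["rados", "-p", pool, "put", "obj1", "/etc/hosts"],
        # 4) bring the original members back in; recovery starts
        ["ceph", "osd", "in", "osd.0", "osd.2"],
        # 5) fake-down the temporary members while recovery is in progress
        ["ceph", "osd", "down", "osd.3", "osd.5"],
    ]


def workaround_commands():
    """Step 6: marking down an OSD in the acting set restarts recovery."""
    return [["ceph", "osd", "down", "osd.0"]]


def run(commands, dry_run=True):
    """Print-friendly form of each command; execute only if dry_run is False."""
    lines = []
    for argv in commands:
        lines.append(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)
    return lines


if __name__ == "__main__":
    for line in run(repro_commands()):
        print(line)
```

Calling run(repro_commands(), dry_run=False) would execute the commands against a test cluster; the OSD ids must first be adjusted to match the PG's actual up/acting set.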
>> 
>> 
>> According to my investigation, if a downed OSD is not a member of the
>> current up/acting set, its PG might stay in 'ReplicaActive' and discard
>> peering requests from the primary.
>> Thus the primary OSD can't exit the unfound state.
>> The PGs of a downed OSD should transition to the 'Reset' state and start peering.
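
The behaviour described above can be illustrated with a deliberately simplified toy model. It is not Ceph's peering state machine: only the state names 'ReplicaActive' and 'Reset' come from the message, while the function, its parameters and the "interval change" rule are assumptions made for illustration.

```python
def next_state(state, hosting_osd, downed_osd, up_acting, force_restart=False):
    """Toy reaction of one PG instance to an OSD being marked down.

    state         -- current state of the PG instance, e.g. "ReplicaActive"
    hosting_osd   -- OSD that holds this PG instance
    downed_osd    -- OSD that the new map marks down
    up_acting     -- current up/acting set of the PG
    force_restart -- assumed stand-in for the proposed fix
    """
    # Assumption: a change in up/acting membership restarts peering on
    # every instance of the PG (this is why step 6 helps).
    if downed_osd in up_acting:
        return "Reset"
    # Proposed behaviour: an instance on the downed OSD restarts peering
    # even though the up/acting set is unchanged.
    if force_restart and hosting_osd == downed_osd:
        return "Reset"
    # Reported behaviour: the instance keeps its old state.
    return state


if __name__ == "__main__":
    acting = (1, 0, 2)
    # Step 5 without the fix: osd.3 is marked down but is not in [1,0,2].
    print(next_state("ReplicaActive", 3, 3, acting))
    # Step 5 with the fix.
    print(next_state("ReplicaActive", 3, 3, acting, force_restart=True))
    # Step 6: osd.0 is in the acting set.
    print(next_state("ReplicaActive", 3, 0, acting))
```

In this model the instance on osd.3 stays 'ReplicaActive' after step 5 unless force_restart is set, which corresponds to the stuck PG and to the proposed fix respectively.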
>> 
>> 
>> I'll post two patches. The first one fixes this issue.
>> The second one is a trivial optimization (optional).
>> 
>> Thanks,
>> Kouya
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
