Re: [PATCH 0/2] osd: force restart peering when osd is marked down

Hi Kouya,

Thank you for reporting and trying to fix this!

Could you please create a tracker ticket [1] to make sure it is not
lost?

And I think it would be much easier to review your patches (and bring
them to the core developers' attention) if you created a pull request
[2]. If you do this, please add the PR link to the tracker ticket.

If you have problems with any of this, just let us know; I can do it
for you.

[1] http://tracker.ceph.com/projects/rados
[2] https://github.com/ceph/ceph

On Mon, May 28, 2018 at 06:36:18PM +0900, Kouya Shimura wrote:
> Hi,
> 
> I've found that a PG can get stuck in 'unfound_recovery' forever after
> some OSDs are marked down.
> 
> For example, the following steps reproduce this (a consolidated script
> is sketched after step 6).
> 
> 1) Create EC 2+1 pool. Assume a PG has [1,0,2] up/acting set.
> 2) Execute "ceph osd out osd.0 osd.2". Now the PG has [1,3,5] up/acting set.
> 3) Put some objects into the PG.
> 4) Execute "ceph osd in osd.0 osd.2". It starts recovering to [1,0,2].
> 5) Execute "ceph osd down osd.3 osd.5". (These downs are fake; osd.3
>    and osd.5 are not actually down.)
>    This leads the PG to enter 'unfound_recovery' and stay there forever.
> 
> Interestingly, this bad situation can be resolved by marking down
> another OSD.
> 
> 6) Executing "ceph osd down osd.0" (any OSD in the acting set is fine)
>    resolves 'unfound_recovery' and restarts recovery.
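> 
> A minimal reproduction sketch of the steps above, assuming a small test
> cluster with at least six OSDs; the profile, pool and object names
> (ec21, ecpool, obj*) and the OSD ids are placeholders:
> 
>   # 1) create an EC 2+1 pool
>   ceph osd erasure-code-profile set ec21 k=2 m=1
>   ceph osd pool create ecpool 32 32 erasure ec21
>   ceph pg ls-by-pool ecpool     # note a PG's up/acting set, e.g. [1,0,2]
> 
>   # 2) mark two acting OSDs out; the PG remaps, e.g. to [1,3,5]
>   ceph osd out osd.0 osd.2
> 
>   # 3) write some objects; they land on the remapped set
>   for i in $(seq 1 10); do rados -p ecpool put obj$i /etc/hosts; done
> 
>   # 4) bring the OSDs back in; recovery towards [1,0,2] starts
>   ceph osd in osd.0 osd.2
> 
>   # 5) fake-down the interim OSDs; the PG gets stuck in unfound_recovery
>   ceph osd down osd.3 osd.5
>   ceph health detail            # reports unfound objects
> 
>   # 6) workaround: mark any OSD of the acting set down to force re-peering
>   ceph osd down osd.0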
> 
> 
> Based on my investigation, if the downed OSD is not a member of the current
> up/acting set, its PG may stay in the 'ReplicaActive' state and discard
> peering requests from the primary. Thus the primary OSD can't get out of
> the unfound state. PGs on a downed OSD should transition to the 'Reset'
> state and start peering.
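> 
> For illustration, one way to observe the symptom from the primary's side
> (the pgid 2.1 is a placeholder): the primary keeps reporting unfound
> objects because the fake-downed replicas stay in ReplicaActive and discard
> its peering requests.
> 
>   ceph health detail | grep -i unfound
>   ceph pg 2.1 query | grep -A20 might_have_unfound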
> 
> 
> I'll post two patches. The first one fixes this issue; the second one is
> a trivial, optional optimization.
> 
> Thanks,
> Kouya
> 

-- 
Mykola Golub


