Hi,

I've found that a PG gets stuck in 'unfound_recovery' forever after some OSDs are marked down. For example, the following steps reproduce this (a consolidated reproduction script is appended at the end of this mail):

1) Create an EC 2+1 pool. Assume a PG has [1,0,2] as its up/acting set.
2) Execute "ceph osd out osd.0 osd.2". Now the PG has [1,3,5] as its up/acting set.
3) Put some objects into the PG.
4) Execute "ceph osd in osd.0 osd.2". The PG starts recovering to [1,0,2].
5) Execute "ceph osd down osd.3 osd.5". (These downs are fake; osd.3 and osd.5 are not actually down.) This makes the PG transition to 'unfound_recovery' and stay there forever.

Interestingly, this bad situation can be resolved by marking down another OSD:

6) Executing "ceph osd down osd.0" (any OSD in the acting set will do) clears 'unfound_recovery' and restarts recovery.

From my investigation, if the downed OSD is not a member of the current up/acting set, its PG may stay in 'ReplicaActive' and discard peering requests from the primary, so the primary OSD can never leave the unfound state. PGs on the downed OSD should transition to the 'Reset' state and start peering.

I'll post two patches. The first one fixes this issue. The second one is a trivial optimization (optional).

Thanks,
Kouya
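
P.S. In case it helps, here is the reproduction collected into a minimal shell sketch. It assumes a throwaway test cluster with at least 6 OSDs; the pool name "ecpool", the EC profile name "ec21", the use of "rados bench" to write objects, and the PGID variable are my own placeholders (not from the steps above), so adjust them to your environment.

#!/bin/sh
# Sketch of the reproduction on a disposable test cluster with >= 6 OSDs.
# Pool/profile names and PGID are placeholders.

# 1) Create an EC 2+1 pool.
ceph osd erasure-code-profile set ec21 k=2 m=1
ceph osd pool create ecpool 32 32 erasure ec21

# Pick a PG of ecpool whose up/acting set is [1,0,2]
# (check with "ceph pg dump pgs_brief") and set its pgid here.
PGID="${PGID:?set this to the pgid of the chosen PG}"

# 2) Mark osd.0 and osd.2 out; the PG should remap to [1,3,5].
ceph osd out osd.0 osd.2

# 3) Write some objects into the pool.
rados -p ecpool bench 10 write --no-cleanup

# 4) Mark them in again; recovery back to [1,0,2] starts.
ceph osd in osd.0 osd.2

# 5) Fake-down osd.3 and osd.5 (they are actually still running).
ceph osd down osd.3 osd.5

# The PG is now stuck; 'unfound_recovery' shows up in its state.
ceph pg "$PGID" query | grep -i unfound

# 6) Workaround: mark down any OSD in the acting set to restart peering.
ceph osd down osd.0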