Re: Is there any way to obtain the maximum number of node failure in ceph without data loss?

Hi Jerry,

In general, your CRUSH rules should define the behaviour you're
looking for. Based on what you've stated about your configuration,
after failing a single node or an OSD on a single node, then you
should still be able to tolerate two more failures in the system
without losing data (or losing access to data, given that min_size=k,
though I believe it's recommended to set min_size=k+1).
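To make that concrete, here is a rough sketch (not an official Ceph tool, just an illustration) of how remaining failure tolerance could be estimated from acting sets. It assumes a chooseleaf-host CRUSH rule (so each host holds at most one chunk of a given PG) and the k=8, m=3 profile you described; missing chunks are represented as None:

```python
# Hypothetical sketch: estimate how many additional host failures a set
# of EC PGs can absorb without losing data. Assumes one chunk per host
# (chooseleaf host) and an EC 8+3 profile, as in the pool described.

K, M = 8, 3  # EC profile: k data chunks, m coding chunks

def pg_tolerance(acting, k=K):
    """Chunk losses this PG can still absorb before dropping below k."""
    present = sum(1 for osd in acting if osd is not None)
    return present - k

def cluster_tolerance(acting_sets):
    """Worst-case additional host failures bearable across all PGs,
    given one chunk per host: the minimum per-PG tolerance."""
    return min(pg_tolerance(a) for a in acting_sets)

# Acting sets from the example below (NONE -> None):
t1 = [15, 31, 11, 34, 28, 1, 8, 26, None, 19, 5]
t3 = [15, 31, 11, 34, None, 1, 8, 26, None, None, 5]
t4 = [15, 31, 11, 34, 21, 1, 8, 26, 19, 29, 5]

print(cluster_tolerance([t1]))  # 2: one chunk already missing
print(cluster_tolerance([t3]))  # 0: at min_size, no further loss bearable
print(cluster_tolerance([t4]))  # 3: fully recovered
```

Running this over all PGs (e.g. the acting sets from "ceph pg dump") would give the cluster-wide worst case, which matches the intuition that tolerance is lowest while recovery is still in flight.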

However, that sequence of acting sets doesn't make a whole lot of
sense to me for a single OSD failure (though perhaps I'm misreading
them). Can you clarify exactly how you simulated the osd.14 failure?
It might also be helpful to post your CRUSH rule and "ceph osd tree".

Josh

On Fri, Jul 23, 2021 at 1:42 AM Jerry Lee <leisurelysw24@xxxxxxxxx> wrote:
>
> Hello,
>
> I would like to know the maximum number of node failures for an EC
> 8+3 pool in a 12-node cluster with 3 OSDs in each node.  The size and
> min_size of the EC 8+3 pool are configured as 11 and 8, and the OSDs
> of each PG are selected by host.  When there is no node failure, the
> maximum number of node failures is 3, right?  After unplugging an OSD
> (osd.14) in the cluster, I checked the PG acting set changes, and one
> of the results is shown below:
>
> T0:
> [15,31,11,34,28,1,8,26,14,19,5]
>
> T1: after unplugging a OSD (osd.14) and recovery started
> [15,31,11,34,28,1,8,26,NONE,19,5]
>
> T2:
> [15,31,11,34,21,1,8,26,19,29,5]
>
> T3:
> [15,31,11,34,NONE,1,8,26,NONE,NONE,5]
>
> T4: recovery was done
> [15,31,11,34,21,1,8,26,19,29,5]
>
> For the PG, 3 OSD peers changed during the recovery process
> ([_,_,_,_,28->21,_,_,_,14->19,19->29,_]).  It seems that only
> min_size (8) chunks of the EC 8+3 pool were available during
> recovery.  Does that mean no further node failures are bearable
> between T3 and T4?  Can we calculate the maximum number of node
> failures by examining the acting sets of all the PGs?  Is there some
> simple way to obtain such information?  Any ideas and feedback are
> appreciated, thanks!
>
> - Jerry
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx


