Re: CEPH failure domain - power considerations

Phil;

I like to refer to basic principles and design assumptions / choices when considering things like this.  I also like to compare against more broadly understood technologies.  Finally, I'm still relatively new to Ceph, so here goes...

TLDR: Ceph is (likes to be) double-redundant (like RAID-6), while dual power feeds (N+1) are only single-redundant.

Like RAID, Ceph (or more precisely a Ceph pool) can be in, and moves through, the following states:

Normal --> Partially Failed (degraded) --> Recovering --> Normal.

When talking about these systems, we often gloss over Recovery, acting as if it takes no time.  Recovery does take time though, and if anything ELSE happens while recovery is ongoing, what can the software do?
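
To put a rough number on it (the drive size and rebuild rate below are assumptions, not measurements):

    # Back-of-the-envelope recovery time; both inputs are assumed,
    # adjust for your own hardware.
    drive_tb = 14             # assumed capacity of a modern large HDD
    rebuild_mb_per_s = 100    # assumed sustained rebuild throughput

    seconds = (drive_tb * 1e12) / (rebuild_mb_per_s * 1e6)
    print(f"~{seconds / 3600:.0f} hours of exposure per rebuild")  # ~39 hours

That's more than a day and a half during which any additional failure lands on the next (or last) layer of redundancy.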

Think RAID-5: what happens if a drive fails in a RAID-5 array, and during the rebuild an unreadable block is found on another drive?  The rebuild fails and data is lost; that's the limit of single redundancy.  If you use RAID-6, the array falls back to its second level of redundancy, and the rebuild continues.
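
A quick sketch of why that bites with RAID-5 (the URE rate and array size are illustrative assumptions; real drives vary):

    # Rough odds of hitting an unrecoverable read error (URE) while
    # rebuilding a RAID-5 array; all numbers are assumptions.
    ure_per_bit = 1e-15        # assumed URE rate (spec-sheet style figure)
    surviving_drives = 6       # assumed array width minus the failed drive
    drive_tb = 14              # assumed drive capacity

    bits_read = surviving_drives * drive_tb * 1e12 * 8
    p_at_least_one_ure = 1 - (1 - ure_per_bit) ** bits_read
    print(f"chance the rebuild hits a URE: {p_at_least_one_ure:.0%}")  # roughly a coin flip

With single redundancy that one bad sector ends the rebuild; with double redundancy (RAID-6) it is simply absorbed.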

As a result of the long recovery times expected of modern large hard drives, Ceph pushes for double redundancy (3x replication, or 5+2 EC).  Further, it reduces availability step by step as redundancy is degraded: when the first layer of redundancy is compromised, writes are still allowed; when the second is lost, writes are disallowed, but reads are allowed; only when all three layers are compromised are reads disallowed.
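
A toy model of that behavior for a 3x replicated pool (this is a sketch of the description above, in the spirit of size/min_size, not Ceph's actual logic):

    # Toy model of the behavior described above for a size=3 pool;
    # illustration only, not Ceph code.
    def allowed_io(surviving_copies, min_size=2):
        if surviving_copies >= min_size:
            return "reads + writes"   # first layer lost: still writable
        if surviving_copies >= 1:
            return "reads only"       # second layer lost: read-only
        return "no I/O"               # all copies lost

    for copies in (3, 2, 1, 0):
        print(copies, "copies ->", allowed_io(copies))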

Dual power feeds (N+1) are only single-redundant, so the system as a whole can't achieve better than single redundancy.  Depending on the reliability of the power, and your service guarantees, this may be acceptable.
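
As a rough illustration (the per-feed availability figure is an assumed example, and independent failures are assumed):

    # Availability of one feed vs. two independent feeds.
    a_feed = 0.999                    # assumed availability of a single feed
    a_dual = 1 - (1 - a_feed) ** 2    # down only if both feeds fail at once
    print(f"single feed: {a_feed:.4%}, dual feeds: {a_dual:.6%}")

Dual feeds tolerate exactly one failure; a second concurrent failure (or a failure while the other feed is down for maintenance) takes the node out, which is what single redundancy means.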

If you add ATSs (automatic transfer switches), then you need to look at their failure rate (MTBF, or similar) to determine whether your service guarantees are impacted.
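
The usual first-order way to fold the ATS in (the MTBF and MTTR values below are placeholders, not vendor figures):

    # Steady-state availability from MTBF / MTTR; placeholder numbers,
    # substitute your vendor's data.
    def availability(mtbf_hours, mttr_hours):
        return mtbf_hours / (mtbf_hours + mttr_hours)

    a_ats = availability(mtbf_hours=500_000, mttr_hours=8)
    a_feed = 0.999                          # same assumed per-feed figure as above
    a_dual_feed = 1 - (1 - a_feed) ** 2

    # The ATS sits in series with the pair of feeds it switches between.
    a_power_path = a_ats * a_dual_feed
    print(f"ATS: {a_ats:.6%}  feeds: {a_dual_feed:.6%}  combined: {a_power_path:.6%}")

If the combined figure still clears your service guarantees, single-corded nodes behind an ATS may be acceptable.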

Dominic L. Hilsbos, MBA 
Director – Information Technology 
Perform Air International Inc.
DHilsbos@xxxxxxxxxxxxxx 
www.PerformAir.com


-----Original Message-----
From: Phil Regnauld [mailto:pr@xxxxx] 
Sent: Friday, May 29, 2020 12:59 AM
To: Hans van den Bogert
Cc: ceph-users@xxxxxxx
Subject:  Re: CEPH failure domain - power considerations

Hans van den Bogert (hansbogert) writes:
> I would second that, there's no winning in this case for your 
> requirements and single PSU nodes. If there were 3 feeds,  then yes; 
> you could make an extra layer in your crushmap much like you would 
> incorporate a rack topology in the crushmap.

	I'm not fully up on coffee for today, so I haven't yet worked out why
	3 feeds would help ? To have a 'tie breaker' of sorts, with hosts spread
	across 3 rails ?
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



