Minimum number of nodes needed for stretch mode?

Hi,

TL;DR

Failure domain considered is data center. Cluster in stretch mode [1].

- What is the minimum number of monitor nodes (apart from the tiebreaker) needed per failure domain?

- What is the minimum number of storage nodes needed per failure domain?

- Are device classes supported with stretch mode?

- Is min_size = 1 in "degraded stretch mode" a hard-coded requirement, or can this be changed so that min_size = 2 is kept (yes, I'm aware that no other OSD may then go down in the surviving data center or PGs will become unavailable)?


I've converted a (test) 3-node replicated cluster (2 storage nodes, 1 monitor-only node, min_size=2, size=4) to a "stretch mode" setup [1]. That works as expected.
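
For reference, this is roughly the sequence used to enable stretch mode, following [1] (a sketch; the monitor names a, b and t and the dc3 location of the tiebreaker are placeholders, and the dc1/dc2 datacenter buckets already exist in the CRUSH map):

ceph mon set election_strategy connectivity
ceph mon set_location a datacenter=dc1
ceph mon set_location b datacenter=dc2
ceph mon set_location t datacenter=dc3    # tiebreaker monitor
ceph mon enable_stretch_mode t stretch_rule datacenter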

CRUSH rule (adjusted to work with only 1 host and 2 OSDs per device class per data center):

rule stretch_rule {
	id 5
	type replicated
	step take dc1
	step choose firstn 0 type host
	step chooseleaf firstn 2 type osd
	step emit
	step take dc2
	step choose firstn 0 type host
	step chooseleaf firstn 2 type osd
	step emit
}
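
For reference, the rule was injected with something like the usual getcrushmap/crushtool round trip (a sketch; file names are arbitrary):

ceph osd getcrushmap -o crush.map.bin
crushtool -d crush.map.bin -o crush.map.txt
# ... add the stretch_rule shown above to crush.map.txt ...
crushtool -c crush.map.txt -o crush.map.new.bin
ceph osd setcrushmap -i crush.map.new.bin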

The documentation seems to suggest that at least 2 storage nodes and 2 monitor nodes are needed. Is that correct, and if so, why? For as minimal a cluster as possible I don't see the need for an additional monitor per data center. Does the tiebreaker monitor function as a normal monitor (apart from not being allowed to become leader)?

When stretch rules with device classes are used, things no longer work as expected. Example CRUSH rule:


rule stretch_rule_ssd {
	id 4
	type replicated
	step take dc1 class ssd
	step choose firstn 0 type host
	step chooseleaf firstn 2 type osd
	step emit
	step take dc2 class ssd
	step choose firstn 0 type host
	step chooseleaf firstn 2 type osd
	step emit
}

A similar CRUSH rule exists for hdd. When I change the crush_rule of one of the pools to stretch_rule_ssd, the PGs on OSDs with device class ssd become inactive as soon as one of the data centers goes offline (and "degraded stretch mode" has been activated, so only 1 bucket, i.e. one data center, is needed for peering). I don't understand why. Another issue is that as soon as the data center is back online, recovery never finishes by itself, and a "ceph osd force_healthy_stretch_mode --yes-i-really-mean-it" is needed to get back to HEALTH_OK.
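
For completeness, switching the pool was nothing more than the following (the pool name is a placeholder):

ceph osd pool set testpool crush_rule stretch_rule_ssd
ceph osd pool get testpool crush_rule   # confirms the rule is active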

Can anyone explain to me why this is?

Gr. Stefan

[1]: https://docs.ceph.com/en/latest/rados/operations/stretch-mode/
