On Thu, Oct 28, 2021 at 8:34 AM Mykola Golub <to.my.trociny@xxxxxxxxx> wrote:
>
> Hi,
>
> I have questions about the "stretch mode" feature [1].
>
> 1) In the limitations section [2] it is stated that EC pool support
> is not implemented yet. Does anyone know what things are missing? I
> understand that there should be a restriction on K+M values; I suppose
> only profiles with M >= K+2 should be allowed, so that we would have
> at least K+1 shards on every site.

I think there were some other incomplete items that kept me from
wanting to support EC in stretch clusters, but unfortunately I can't
remember what they were off-hand. It might simply have been a lack of
confidence that I could correctly set up the peering minimum
requirements for EC pools: replication just needs a peer in each site,
but EC needs to guarantee recoverability in each site, and that is not
a trivial problem in general. You could try to gate peering on the
recoverable check function provided by the EC plugin, but I'm not sure
that is the only thing needed.

It does not surprise me that things *seem* to work at a basic level,
though; I definitely aimed to make the data structures and logic
future-compatible with more than 2 sites and with erasure coding!

> And for a 2+4 profile the rule could be:
>
> rule stretch_rule {
>         id 1
>         type replicated
>         step take site1
>         step chooseleaf indep 3 type host
>         step emit
>         step take site2
>         step chooseleaf indep 3 type host
>         step emit
> }
>
> This is based on the example for a replicated pool provided in the
> stretch mode documentation.
>
> And I just tried it, and to make it (apparently) work I only had to
> remove the EC pool restrictions in the code. So I wonder what I am
> missing.
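[Editorial aside, not part of the original thread: a minimal sketch of the arithmetic behind the proposed M >= K+2 restriction, assuming shards are split evenly between exactly two sites and that a site can serve data on its own only while it holds at least K+1 shards (K to reconstruct, plus one so a single OSD loss there stays recoverable). The function name and the even-split assumption are the editor's, not Ceph's.]

```python
def site_can_serve_alone(k: int, m: int) -> bool:
    """Return True if, with K+M shards split evenly across two sites,
    each site holds at least K+1 shards and so could satisfy an EC
    min_size of K+1 by itself after losing the other site."""
    total = k + m
    if total % 2 != 0:          # uneven split: one site gets fewer shards
        return False
    per_site = total // 2
    return per_site >= k + 1    # algebraically equivalent to m >= k + 2

# The 2+4 profile from the thread qualifies; common wide profiles do not.
assert site_can_serve_alone(2, 4) is True    # 3 shards per site, need 3
assert site_can_serve_alone(4, 2) is False   # 3 shards per site, need 5
assert site_can_serve_alone(8, 4) is False   # 6 shards per site, need 9
```

This only captures the shard-count constraint; as the reply notes, peering correctness for EC in stretch mode involves more than this arithmetic.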
> 2) Looking at the example for a replicated pool in the doc [1] (which
> I used to make a rule for a replicated pool):
>
> rule stretch_rule {
>         id 1
>         type replicated
>         step take site1
>         step chooseleaf firstn 2 type host
>         step emit
>         step take site2
>         step chooseleaf firstn 2 type host
>         step emit
> }
>
> With this rule the primary is always on site1 (and the same problem
> exists with my EC rule, BTW), which does not look perfect, i.e. reads
> will always go to OSDs on site1. Is it a known limitation?

The CRUSH rule can be whatever you want it to be, as long as it
provides two copies in each site. But keeping the primaries on one
side has a lot of utility if you are running it with the expectation
of doing live failover in the case of a disaster: if all your live
services run in DC1, you can make sure they serve reads out of the
local data center and do not have to send writes in *both* directions.
-Greg

> [1] https://docs.ceph.com/en/latest/rados/operations/stretch-mode/
> [2] https://docs.ceph.com/en/latest/rados/operations/stretch-mode/#stretch-mode-limitations
>
> --
> Mykola Golub
> _______________________________________________
> Dev mailing list -- dev@xxxxxxx
> To unsubscribe send an email to dev-leave@xxxxxxx
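[Editorial aside, not part of the original thread: a toy illustration, in Python rather than CRUSH syntax, of why the two-emit rule above pins primaries to site1. CRUSH concatenates the OSDs produced by each "step emit" in rule order, and the primary of a PG is the first entry of the resulting acting set (absent overrides such as `ceph osd primary-affinity`). The OSD ids below are made up for the example.]

```python
def acting_set(site1_osds, site2_osds):
    """Model the two 'step emit's: site1's selection is concatenated
    before site2's, so acting[0] is always a site1 OSD."""
    return list(site1_osds) + list(site2_osds)

# Hypothetical selections for one PG: two hosts chosen per site.
acting = acting_set([3, 7], [12, 15])
primary = acting[0]

assert acting == [3, 7, 12, 15]
assert primary in {3, 7}   # the primary always lands in site1
```

As the reply argues, this one-sidedness is a feature rather than a bug when all live clients run next to site1; if read locality on site2 mattered, adjusting primary affinity on site1's OSDs would be the usual knob to reach for.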