Hi Gregory,

Thank you for clarifying this. I am going to research this further and
hope to be back with some results or a PR.

Thanks,

--
Mykola Golub

On Thu, Oct 28, 2021 at 01:30:16PM -0700, Gregory Farnum wrote:
> On Thu, Oct 28, 2021 at 8:34 AM Mykola Golub <to.my.trociny@xxxxxxxxx> wrote:
> >
> > Hi,
> >
> > I have questions about the "stretch mode" feature [1].
> >
> > 1) The limitations section [2] states that EC pool support is not
> > implemented yet. Does anyone know what is missing? I understand that
> > there should be a restriction on the K+M values: I suppose only
> > profiles with M >= K+2 should be allowed, so that we have at least
> > K+1 shards on every site.
>
> I think there were some other incomplete items that kept me from
> wanting to support EC in stretch clusters, but unfortunately I can't
> remember what they were off-hand.
> It might have just been an utter lack of confidence that I could
> correctly set up peering minimum requirements for EC pools, though.
> Replication just needs a peer in each site, but EC needs to guarantee
> recoverability in each site, and that is not a trivial problem in
> general. You could try to gate peering on the recoverability check
> function provided by the EC plugin, but I'm not sure whether that is
> the only thing needed.
>
> It does not surprise me that things *seem* to work at a basic level,
> though. I definitely was aiming to make the data structures and logic
> forward-compatible with more than two sites and with erasure coding!
>
> > And for a 2+4 profile the rule could be:
> >
> >     rule stretch_rule {
> >             id 1
> >             type erasure
> >             step take site1
> >             step chooseleaf indep 3 type host
> >             step emit
> >             step take site2
> >             step chooseleaf indep 3 type host
> >             step emit
> >     }
> >
> > This is based on the example for a replicated pool provided in the
> > stretch mode documentation.
> >
> > And I just tried it: to make it (apparently) work, I only had to
> > remove the EC pool restrictions in the code. So I wonder what I am
> > missing.
> >
> > 2) Looking at the example for a replicated pool in the doc [1]
> > (which I used to make a rule for a replicated pool):
> >
> >     rule stretch_rule {
> >             id 1
> >             type replicated
> >             step take site1
> >             step chooseleaf firstn 2 type host
> >             step emit
> >             step take site2
> >             step chooseleaf firstn 2 type host
> >             step emit
> >     }
> >
> > With this rule the primary is always on site1 (and my EC rule has the
> > same problem, BTW), which does not look perfect: reads will always go
> > to OSDs on site1. Is it a known limitation?
>
> The CRUSH rule can be whatever you want it to be, as long as it
> provides two copies in each site. But keeping the primaries on one
> side has a lot of utility if you are running it with the expectation
> of doing live failover in the case of a disaster: if all your live
> services run in DC1, you can make sure they serve reads out of the
> local data center and do not have to send writes in *both* directions.
> -Greg
>
> > [1] https://docs.ceph.com/en/latest/rados/operations/stretch-mode/
> > [2] https://docs.ceph.com/en/latest/rados/operations/stretch-mode/#stretch-mode-limitations
> >
> > --
> > Mykola Golub

--
Mykola Golub
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx
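
For reference, the 2+4 layout discussed above could be exercised with
commands along these lines (the profile and pool names here are
illustrative, and this assumes the EC restriction in stretch mode has
been lifted as described in the thread):

    # Create a 2+4 erasure code profile and a pool that uses it
    ceph osd erasure-code-profile set stretch_ec k=2 m=4 crush-failure-domain=host
    ceph osd pool create testpool 64 64 erasure stretch_ec
    # Attach the stretch rule from the example, and require K+1 shards
    # for I/O so that a single surviving site (3 shards) still satisfies min_size
    ceph osd pool set testpool crush_rule stretch_rule
    ceph osd pool set testpool min_size 3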
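
An edited rule like the ones above can also be sanity-checked offline
with crushtool before injecting it into the cluster, e.g.:

    # Export, decompile, edit, and recompile the CRUSH map
    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    crushtool -c crushmap.txt -o crushmap.new
    # Simulate placements for rule id 1 with 6 replicas/shards (2+4)
    crushtool -i crushmap.new --test --rule 1 --num-rep 6 --show-mappings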
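
On the primary placement question: the first OSD in the acting set
reported by "ceph pg map" is the primary, which makes it easy to verify
that all primaries land on site1. For replicated pools, primary
affinity can bias primary selection away from one site; my
understanding is that it does not help with EC pools, where the primary
follows the shard placement:

    # 2.0 is a hypothetical PG id; the first OSD in the acting set is the primary
    ceph pg map 2.0
    # For replicated pools: make osd.4 (assumed to be on site1) an unlikely primary
    ceph osd primary-affinity osd.4 0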