So, is it okay to say that, compared to the 'firstn' mode, the 'indep' mode may have less of an impact on a cluster in the event of an OSD failure? Could I use 'indep' for a replicated pool as well?

Thank you!

Regards,
Cody

On Wed, Aug 22, 2018 at 7:12 PM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
> On Wed, Aug 22, 2018 at 12:56 AM Konstantin Shalygin <k0ste@xxxxxxxx> wrote:
>>
>> > Hi everyone,
>> >
>> > I read an earlier thread [1] that gave a good explanation of the 'step
>> > choose|chooseleaf' option. Could someone further help me to understand
>> > the 'firstn|indep' part? Also, what is the relationship between 'step
>> > take' and 'step choose|chooseleaf' when it comes to defining a failure
>> > domain?
>> >
>> > Thank you very much.
>>
>>
>> This is documented under CRUSH Map Rules [1].
>>
>>
>> [1]
>> http://docs.ceph.com/docs/master/rados/operations/crush-map-edits/#crush-map-rules
>>
>
> But that doesn't seem to really discuss it, and I don't see it elsewhere in our docs either. So:
>
> "indep" and "firstn" are two different strategies for selecting items (mostly, OSDs) in a CRUSH hierarchy. If you're storing EC data you want to use indep; if you're storing replicated data you want to use firstn.
>
> The reason has to do with how they behave when a previously selected device fails. Let's say you have a PG stored on OSDs 1, 2, 3, 4, 5, and then 3 goes down.
>
> With the "firstn" mode, CRUSH simply adjusts its calculation: it selects 1 and 2, then selects 3 but discovers it's down, so it retries and selects 4 and 5, and then goes on to select a new OSD, 6. The final CRUSH mapping change is:
> 1, 2, 3, 4, 5 -> 1, 2, 4, 5, 6
>
> But if you're storing an EC pool, that means you just changed the data mapped to OSDs 4, 5, and 6! That's terrible! So the "indep" mode attempts not to do that. (It still *might* conflict, but the odds are much lower.) You can instead expect it, when it selects the failed 3, to try again and pick out 6, for a final transformation of:
> 1, 2, 3, 4, 5 -> 1, 2, 6, 4, 5
> -Greg
>
>>
>> k
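
For concreteness, the firstn/indep choice is just the keyword on the 'step choose' or 'step chooseleaf' line of a CRUSH rule. Below is a minimal sketch of the two rule shapes Greg describes, assuming a 'default' root, a 'host' failure domain, and (for the EC case) a k=3, m=2 profile to match the five-OSD example above; the rule names and ids are illustrative, not taken from the thread.

rule replicated_rule {
        id 0
        type replicated
        min_size 1
        max_size 10
        # start descending from the 'default' root bucket
        step take default
        # pick pool-size distinct hosts, one leaf OSD under each (firstn: replicated)
        step chooseleaf firstn 0 type host
        step emit
}

rule ec_k3m2_rule {
        id 1
        type erasure
        min_size 3
        max_size 5
        # extra retries are commonly set on EC rules so CRUSH can find enough OSDs
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        # pick k+m distinct hosts, one leaf OSD under each (indep: erasure-coded)
        step chooseleaf indep 0 type host
        step emit
}

This also illustrates the relationship asked about in the original question: 'step take' fixes the subtree of the hierarchy the rule starts from, and 'step chooseleaf firstn|indep 0 type host' then descends from it, choosing the required number of distinct host buckets and one OSD under each, which is what makes 'host' the failure domain here.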