Re: Question about 'firstn|indep'

So, is it fair to say that, compared to 'firstn' mode, 'indep' mode
has less impact on a cluster in the event of an OSD failure? Could I
use 'indep' for a replicated pool as well?

Thank you!

Regards,
Cody
On Wed, Aug 22, 2018 at 7:12 PM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
>
> On Wed, Aug 22, 2018 at 12:56 AM Konstantin Shalygin <k0ste@xxxxxxxx> wrote:
>>
>> > Hi everyone,
>> >
>> > I read an earlier thread [1] that gave a good explanation of the 'step
>> > choose|chooseleaf' option. Could someone further help me understand
>> > the 'firstn|indep' part? Also, what is the relationship between 'step
>> > take' and 'step choose|chooseleaf' when it comes to defining a failure
>> > domain?
>> >
>> > Thank you very much.
>>
>>
>> This is documented in CRUSH Map Rules [1]
>>
>>
>> [1]
>> http://docs.ceph.com/docs/master/rados/operations/crush-map-edits/#crush-map-rules
>>
>
> But that doesn't seem to really discuss it, and I don't see it elsewhere in our docs either. So:
>
> "indep" and "firstn" are two different strategies for selecting items (mostly, OSDs) in a CRUSH hierarchy. If you're storing EC data you want to use indep; if you're storing replicated data you want to use firstn.
>
> The reason has to do with how they behave when a previously-selected device fails. Let's say you have a PG stored on OSDs 1, 2, 3, 4, 5. Then 3 goes down.
> With the "firstn" mode, CRUSH simply adjusts its calculation so that it selects 1 and 2, then selects 3 but discovers it's down, so it retries and selects 4 and 5, and then goes on to select a new OSD, 6. So the final CRUSH mapping change is
> 1, 2, 3, 4, 5 -> 1, 2, 4, 5, 6.
>
> But if you're storing an EC pool, that means you just changed the data mapped to OSDs 4, 5, and 6! That's terrible! So the "indep" mode attempts to not do that. (It still *might* conflict, but the odds are much lower). You can instead expect it, when it selects the failed 3, to try again and pick out 6, for a final transformation of:
> 1, 2, 3, 4, 5 -> 1, 2, 6, 4, 5
> -Greg
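
To make the two transformations above concrete, here is a minimal toy sketch in Python. It is not CRUSH itself: the draw() function, the six-OSD cluster, and the max_tries limit are all assumptions made up for illustration, and draw() is a deliberately trivial stand-in for CRUSH's pseudo-random hash so the result can be followed by hand. It only shows the difference in shape between filling an ordered list (firstn) and filling fixed slots (indep).

```python
NUM_OSDS = 6  # hypothetical toy cluster: OSDs 1..6


def draw(r):
    """Toy stand-in for CRUSH's pseudo-random choice on attempt number r."""
    return (r % NUM_OSDS) + 1


def firstn(n, down, max_tries=50):
    """Fill an ordered list; skipping a failed OSD shifts later entries left."""
    out = []
    for r in range(max_tries):
        if len(out) == n:
            break
        osd = draw(r)
        if osd not in down and osd not in out:
            out.append(osd)
    return out


def indep(n, down, max_tries=50):
    """Fill each slot on its own; only slots that are still empty are retried."""
    out = [None] * n
    for attempt in range(max_tries):
        for slot in range(n):
            if out[slot] is not None:
                continue  # this slot already holds a usable OSD, leave it alone
            osd = draw(slot + attempt * n)
            if osd not in down and osd not in out:
                out[slot] = osd
        if None not in out:
            break
    return out


print(firstn(5, down=set()), "->", firstn(5, down={3}))
print(indep(5, down=set()), "->", indep(5, down={3}))
```

With OSD 3 marked down, the firstn line ends in 1, 2, 4, 5, 6 (everything after the failed OSD moves up one position), while the indep line ends in 1, 2, 6, 4, 5 (only the failed slot changes). In an EC pool each position holds a different shard, which is why the second behaviour moves far less data.
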
>
>>
>>
>>
>> k
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com