Re: look into erasure coding

On 07/11/2014 15:37, eric mourgaya wrote:
> Hi,
> 
> "In an erasure-coded pool, how do we know which OSDs keep the data chunks and which ones keep the coding chunks?"
> 
> This question came up yesterday on the ceph IRC channel about erasure code. According to http://ceph.com/docs/giant/dev/erasure-coded-pool/,
> there really is a difference between the k and m chunks (supposing that m is the number of data splits and k the number of coding chunks): the data is split into m chunks, and we generate k coding chunks based on those m chunks. So these chunks are different, right?

http://ceph.com/docs/master/rados/operations/erasure-code-profile/

Yes (except that k is the number of data chunks and m the number of coding chunks).
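
For example, as a minimal sketch (the profile and pool names below are only placeholders), an object stored with k=4 and m=2 is split into 4 data chunks plus 2 coding chunks, and any 4 of the 6 chunks are enough to rebuild it:

  ceph osd erasure-code-profile set example_profile k=4 m=2    # 4 data + 2 coding chunks
  ceph osd erasure-code-profile get example_profile             # show the resulting profile
  ceph osd pool create ecpool 128 128 erasure example_profile   # pool using that profile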

> 1) First, all the documentation about erasure coding says that the number of coding chunks is greater than the number of data chunks,

Which documentation says that? It is usually not the case, and if that's in the Ceph documentation it would be good to fix it.

> but looking at https://wiki.ceph.com/Planning/Blueprints/Dumpling/Erasure_encoding_as_a_storage_backend, I see that the number of coding chunks is not greater than the number of data chunks. What is the goal, reducing the used space?
> 
> 2) We allow the loss of k chunks out of (n+k) chunks.
>   What happens when you lose a chunk: does Ceph rebuild it somewhere on another OSD (not on one of the n+k-1 remaining ones)?

When you try to read an object and one chunk is missing, it is rebuilt on the fly.

> 3) So is it important to know where the coding chunks are in order to write failure rules?

Yes. As with replicated pools, choosing the right failure domain is an important architectural decision.
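
As an illustration (the pool and object names are only placeholders), you can see which OSDs the chunks of a given object are mapped to, in acting-set order:

  ceph osd map ecpool myobject   # prints the placement group and the up/acting sets of OSDs holding the chunks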

> 4) According to the following rules in the crush map, erasure code does not take care of the failure domain rule, right? i.e. the ruleset of the pool does not matter, right?

The erasure code encoding / decoding functions do not know about the failure domain. But each erasure code plugin has a function that creates a sensible crush ruleset based on the parameters it is given. And the operator can change the failure domain with the ruleset-failure-domain parameter. See the documentation of the jerasure plugin for example:

http://ceph.com/docs/master/rados/operations/erasure-code-jerasure/
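
For instance, a minimal sketch (the profile and pool names are placeholders): with three rooms and the room failure domain, each of the 3 chunks ends up in a different room, so losing one room costs at most one chunk of each object.

  ceph osd erasure-code-profile set room_profile k=2 m=1 ruleset-failure-domain=room
  ceph osd pool create roompool 128 128 erasure room_profile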

> Let's take an example: my failure domain is composed of 3 rooms, so usually, in a pool with size equal to 3, we have a replica in each room. But with the erasure coding rule we don't have this; is the rule applied only to the chunks that contain the coding?
>  
> rule replicated_ruleset {
>         ruleset 0
>         type replicated
>         min_size 1
>         max_size 10
>         step take default
>         step chooseleaf firstn 0 type room
>         step emit
> }
> 
> rule erasure-code {
>         ruleset 1
>         type erasure
>         min_size 3
>         max_size 20
>         step set_chooseleaf_tries 5
>         step take default
>         step chooseleaf indep 0 type host
>         step emit
> }
> 
> 
> 5) What do you think about something like:
> 
> rule room_erasure-code {
>         ruleset 1
>         type erasure
>         min_size 3
>         max_size 20
>         step set_chooseleaf_tries 5
>         step take default
>         step chooseleaf indep 0 type room
>         step emit
> }
> 
> 
> And an erasure code with (m=3 and k=2).
> And are these settings available with:
> 
> ceph osd erasure-code-profile set failprofile k=2 m=3
> ceph osd erasure-code-profile set failprofile ruleset-root=room_erasure-code
>  And I can lose 2 rooms without problem, right?
> 
> I would like to add a summary of your answers to the documentation; would you help me with this?
> 
> -- 
> Eric Mourgaya,
> 
> 
> Let's respect the planet!
> Let's fight mediocrity!

Let's do that ;-)

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre


