On 17/01/2014 12:18, Andreas Joachim Peters wrote:
> Is k:4 not wrong? I want to build the local parity using 4 data + 2 RS stripes ?!?!?
>

I misunderstood and did not consider the case where you would want to do this. I'm glad you raised this now :-) Reading http://home.ie.cuhk.edu.hk/~mhchen/papers/pyramid.ToS.13.pdf my understanding is that local parity is not calculated for chunks created by the lower level. Am I reading it incorrectly ? In the context of Ceph I think you're right anyway : local parity needs to apply to chunks generated at the global level.

>
> { "plugin": "xor",
>   "k": 4,
>   "m": 1,
>   "item": "datacenter",
>   "mapping": "0000--^1111--^2222--^",
> },
>
> ________________________________________
> From: Loic Dachary [loic@xxxxxxxxxxx]
> Sent: 17 January 2014 12:00
> To: Andreas Joachim Peters
> Cc: Ceph Development
> Subject: Re: Pyramid Erasure Code plugin (draft)
>
> On 17/01/2014 11:34, Andreas-Joachim Peters wrote:
>> Hi Loic,
>>
>> I don't think I understand whether this really works for all cases, and sysadmins will probably be lost without ready-to-use templates.
>
> I agree, providing a sensible default is important. I'll draft something.
>
>> Can you write down with this syntax a rule like this:
>>
>> => build 12 data chunks (d1...d12)
>> => build 6 RS chunks, distribute (p1..p6)
>> => arrange them as : lp1=(d1,d2,d3,d4,p1,p2) lp2=(d5,d6,d7,d8,p3,p4) lp3=(d9,d10,d11,d12,p5,p6)
>> => map 21 stripes to 3 data center as: D1=(d1,d2,d3,d4,p1,p2,lp1) D2=(d5,d6,d7,d8,p3,p4,lp2) D3=(d9,d10,d11,d12,p5,p6,lp3)
>> e.g. chunk(0...21) = (d1,d2,d3...lp1,d5,d6,d7...lp2,d9,d10,d11...lp3)
>
> Here is how it translates : http://tracker.ceph.com/issues/7146#note-2 ( replacing | with - ... maybe more readable ).
>
> Does that make sense ?
>>
>> Thanks, Andreas.
>>
>> On Fri, Jan 17, 2014 at 10:48 AM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
>>
>> Hi Andreas,
>>
>> I spent some time this week trying to figure out something that would be reasonably generic, readable from the sysadmin point of view and simple to implement. The input of the plugin is here:
>>
>> http://tracker.ceph.com/issues/7146#note-1
>>
>> The json structure describes the pyramid and associates an erasure code method with each layer, including parameters. The mapping describes how chunks relate to the list of OSDs obtained from crush. For instance in |^000111^| the | are ignored ( whitespace is confusing because it's not easy to figure out visually how many of them there are ), ^ marks a coding chunk, any other character is a data chunk. The pyramid encoding function reads this and encodes the first three data chunks with one coding chunk. The re-ordering of the chunks is done by the pyramid code and the underlying erasure code method does not need to know anything about it. There is no copy involved, it re-orders pointers ( bufferptr ).
>>
>> Here is a draft implementation (not compiling, not working, but the logic looks right to me):
>>
>> encode :
>> https://github.com/dachary/ceph/blob/wip-pyramid/src/osd/ErasureCodePluginPyramid/ErasureCodePyramid.cc#L250
>>
>> decode :
>> https://github.com/dachary/ceph/blob/wip-pyramid/src/osd/ErasureCodePluginPyramid/ErasureCodePyramid.cc#L367
>>
>> The plugins for each layer would be loaded at init time :
>>
>> https://github.com/dachary/ceph/blob/wip-pyramid/src/osd/ErasureCodePluginPyramid/ErasureCodePyramid.cc#L83
>>
>> with as many consistency checks as possible, for instance:
>>
>> https://github.com/dachary/ceph/blob/wip-pyramid/src/osd/ErasureCodePluginPyramid/ErasureCodePyramid.cc#L102
>>
>> so that runtime can assume constraints are enforced.
>>
>> Please let me know if you see something that does not look right, this is a draft, it can be reworked 100% ;-)
>>
>> Cheers
>>
>> --
>> Loïc Dachary, Artisan Logiciel Libre
>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre
>

--
Loïc Dachary, Artisan Logiciel Libre
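
For readers following the mapping syntax described in the quoted message ( | is ignored, ^ marks a coding chunk, any other character is a data chunk ), here is a minimal Python sketch of how such a string could be read. It is only an illustration under assumptions: the function name parse_mapping and its parameters are hypothetical and are not taken from the draft ErasureCodePyramid.cc, and the sketch does not decide which coding chunk belongs to which group of data chunks, since the thread does not spell that out.

```python
def parse_mapping(mapping, ignored="|", coding="^"):
    """Classify each chunk position of a mapping string.

    Hypothetical helper for illustration only, not the plugin's code.
    Returns (data_groups, coding_positions), where data_groups maps each
    data character to the list of chunk positions that carry it.
    """
    data_groups = {}       # data character -> chunk positions
    coding_positions = []  # positions that hold a coding chunk
    position = 0
    for char in mapping:
        if char in ignored:
            # Separators are visual aids and do not occupy a chunk position.
            continue
        if char == coding:
            coding_positions.append(position)
        else:
            data_groups.setdefault(char, []).append(position)
        position += 1
    return data_groups, coding_positions


# The example string from the message above.
data_groups, coding_positions = parse_mapping("|^000111^|")
print(data_groups)       # {'0': [1, 2, 3], '1': [4, 5, 6]}
print(coding_positions)  # [0, 7]
```

The positions index into the list of OSDs obtained from crush, so a layer only needs these index lists to pick its chunks; that matches the remark that re-ordering is done on pointers without copying.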