Re: CEPH Erasure Encoding + OSD Scalability

On 26/09/2013 23:49, Andreas Joachim Peters wrote:
> Sure, this text is clear, but it does not talk about the cost of reconstruction. Selecting a parity chunk instead of a data chunk costs CPU and increases latency, but this is not reflected by the external cost parameter. For example, with RS (3,2), i.e. 3 data and 2 parity chunks [0,1,2,3,4] with equal cost values, I would select [0,1,2] since it avoids computation; the retrieval cost for [2,3,4] would be the same, but the computational cost is higher.

The implementation already knows about the computational cost and is able to figure out that [0,1,2] is going to be cheaper. It does not need input from the caller: the minimum_to_decode method (without the cost)
https://github.com/ceph/ceph/blob/master/src/osd/ErasureCodePluginJerasure/ErasureCodeJerasure.cc#L45
does this. If you want to read [0,1,2] and have [0,1,2,3,4] available, it will return that you need to retrieve [0,1,2] and not [2,3,4], although both would allow you to get the content of [0,1,2].
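
The selection described above can be illustrated with a minimal stand-alone sketch. It is not the code behind the link: the function name minimum_to_decode_sketch, the explicit k parameter and the use of std::set are assumptions made for illustration, and it assumes a code in which any k chunks are enough to reconstruct the rest.

```cpp
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <set>

// Hypothetical stand-alone sketch (not the Ceph source).
// k is the number of data chunks. Returns 0 on success and -EIO when
// fewer than k chunks are available and the wanted chunks are not all there.
int minimum_to_decode_sketch(unsigned k,
                             const std::set<int> &want_to_read,
                             const std::set<int> &available,
                             std::set<int> *minimum)
{
  assert(minimum != nullptr);
  minimum->clear();
  // Every wanted chunk is available: read exactly those, no decoding needed.
  if (std::includes(available.begin(), available.end(),
                    want_to_read.begin(), want_to_read.end())) {
    *minimum = want_to_read;
    return 0;
  }
  // Otherwise a decode is unavoidable; assume any k chunks are sufficient.
  if (available.size() < k)
    return -EIO;
  std::set<int>::const_iterator it = available.begin();
  for (unsigned j = 0; j < k; ++j, ++it)
    minimum->insert(*it);
  return 0;
}
```

With k = 3, wanting [0,1,2] while [0,1,2,3,4] are available yields [0,1,2]. If chunk 0 is missing, the sketch falls back to the first three available chunks, [1,2,3], and a decode is required.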

> 
> Now if [0] has, for example, twice the cost of chunk [3], it is not clear to me whether [1,2,3] is a better set than [0,1,2] ... is the meaning of a higher cost actually more a binary flag saying 'avoid reading this chunk if possible'?
> 
> Could you give a practical example when a chunk can have a higher cost in a CEPH setup and a rough range for the 'cost' parameter?

At the moment I can't, because it depends on the implementation of the erasure code placement group and it's not complete yet. You are correct: the interpretation of the cost by the plugin cannot be fully described without an intimate knowledge of the implementation. It also means that if the implementation of the caller changes, the semantics of the cost will change and may require a different strategy.
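
To make the ambiguity concrete, here is one possible reading of the cost, written as a minimal sketch. It is purely an assumption for illustration: the function name cheapest_k_chunks and the rule "take the k chunks with the lowest retrieval cost, ties going to the lowest chunk index" are hypothetical and are not what the plugin or the caller is specified to do. It deliberately ignores the CPU cost of decoding, which is exactly the part the external cost does not capture.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

// Hypothetical strategy (an assumption, not the Ceph behaviour):
// choose the k available chunks with the lowest retrieval cost.
// available_cost maps chunk index -> cost. Ties are broken by the lowest
// chunk index, so data chunks win over parity chunks at equal cost.
// Returns an empty set when fewer than k chunks are available.
std::set<int> cheapest_k_chunks(unsigned k,
                                const std::map<int, int> &available_cost)
{
  assert(k > 0);
  std::set<int> chosen;
  if (available_cost.size() < k)
    return chosen;
  std::vector<std::pair<int, int> > by_cost;  // (cost, chunk index)
  for (const auto &kv : available_cost)
    by_cost.push_back(std::make_pair(kv.second, kv.first));
  std::sort(by_cost.begin(), by_cost.end());  // by cost, then by chunk index
  for (unsigned i = 0; i < k; ++i)
    chosen.insert(by_cost[i].second);
  return chosen;
}
```

Under this reading, equal costs on [0,1,2,3,4] with k = 3 give [0,1,2], while doubling the cost of chunk 0 gives [1,2,3], even though that set forces a decode. Whether this is the better choice depends on what the caller means by cost.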

Cheers

> Thanks Andreas.
> 
> 
> 
> 
> ________________________________________
> From: Loic Dachary [loic@xxxxxxxxxxx]
> Sent: 26 September 2013 21:18
> To: Andreas Joachim Peters
> Cc: Ceph Development
> Subject: Re: CEPH Erasure Encoding + OSD Scalability
> 
> [re-adding ceph-devel to the cc]
> 
> On 26/09/2013 20:36, Andreas-Joachim Peters wrote:
>> Hi Loic,
>> today I forked the CEPH repository and will commit my changes to my GitHub fork asap ... (I am not familiar with GitHub in particular).
>> I was finalizing the minimum_to_decode function today with test cases (it is more sophisticated in this case ...) ... I didn't fully get what the 'with cost' function is supposed to do differently from the one without cost?
> 
> I'd be happy to explain if
> https://github.com/ceph/ceph/blob/master/src/osd/ErasureCodeInterface.h#L131
> is unclear. Would you be so kind as to tell me what is confusing in the description ?
> 
>>
>>
>> Cheers Andreas.
>>
>> On Wed, Sep 25, 2013 at 8:48 PM, Loic Dachary <loic@xxxxxxxxxxx <mailto:loic@xxxxxxxxxxx>> wrote:
>>
>>
>>
>>     On 25/09/2013 20:33, Andreas Joachim Peters wrote:
>>     > Yes, sure. I actually thought the same in the meantime ...  I have some questions:
>>     >
>>     > Q: Can/should it stay in the Google Test framework, or would you prefer just a plain executable?
>>     >
>>
>>     A plain executable would make sense. A simple example from src/test/Makefile.am:
>>
>>     ceph_test_trans_SOURCES = test/test_trans.cc
>>     ceph_test_trans_LDADD = $(LIBOS) $(CEPH_GLOBAL)
>>     bin_DEBUGPROGRAMS += ceph_test_trans
>>
>>
>>     > I have added local parity support to your erasure class adding a new argument: "erasure-code-lp" and
>>     > two new methods:
>>     >
>>     > localparity_encode(...)
>>     > localparity_decode(...)
>>     >
>>     > I made a more complex benchmark of (8,2) + 2 local parities (1^2^3^4, 5^6^7^8) which benchmarks performance of encoding/decoding as speed & effective write-latency for three cases (each for liberation & cauchy_good codecs):
>>     >
>>     > 1 (8,2)
>>     > 2 (8,2,lp=2)
>>     > 3 (8,2,lp=2) + crc32c (blocks)
>>     >
>>     > and several failure scenarios ... single, double, triple disk failures. Probably the best is if I make all these parameters configurable.
>>
>>     Great :-) Do you have a public git repository where I could clone this & give it a try ?
>>
>>     > Q: For the local parity implementation .... shall I inherit from your erasure plugin and override the encode/decode methods, or would you consider a patch to the original class?
>>
>>     It is perfect timing for a patch to the original class.
>>
>>     > I also have a 128-bit XOR implementation for the local parities. This will work with recent gcc & clang compilers ...
>>     >
>>     > Q: Which compilers/platforms are supported by CEPH? Is there a minimal GCC version?
>>
>>     You can see all supported platforms here:
>>
>>     http://ceph.com/gitbuilder.cgi
>>
>>     I don't think the GCC version shows in the logs but you can probably figure it out from the corresponding distribution.
>>
>>     > Q: is there some policy restricting comments within code? In general I see very few or no comments within the code ..
>>
>>     :-) The mon code tends to be more heavily commented than the osd code (IMO) but I'm not aware of any policy. When I feel the need to comment, I write a unit test. If the unit test is difficult, I tend to comment to clarify its purpose. The problem with comments is that they quickly become obsolete and/or misleading. That being said, I don't think anyone will object if you heavily comment your code.
>>
>>     Cheers
>>
>>     > Cheers Andreas.
>>     >
>>     >
>>     >
>>     >
>>
>>     --
>>     Loïc Dachary, Artisan Logiciel Libre
>>     All that is necessary for the triumph of evil is that good people do nothing.
>>
>>
> 
> --
> Loïc Dachary, Artisan Logiciel Libre
> All that is necessary for the triumph of evil is that good people do nothing.
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.
