Re: Erasure encoding as a storage backend


 




On 05/04/2013 08:47 PM, Noah Watkins wrote:
> 
> On May 4, 2013, at 11:36 AM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
> 
>>
>>
>> On 05/04/2013 08:27 PM, Noah Watkins wrote:
>>>
>>> On May 4, 2013, at 10:16 AM, Loic Dachary <loic@xxxxxxxxxxx> wrote:
>>>
>>>> it would be great to get feedback before the ceph summit to address the most prominent issues.
>>>
>>> One thing that has been in the back of my mind is how this proposal is influenced (if at all) by a future that includes declustered per-file raid in CephFS. I realize that may be a distant future, but it seems as though there could be a lot of overlap for the (non-client driven) rebuild/recovery component of such an architecture.
>>
>> Hi Noah,
>>
>> I'm not sure what declustered per-file raid is, which means it had no influence on this proposal ;-) Would you be so kind as to educate me ?
> 
> I'm definitely far from an expert on the topic. But briefly the way I think about it is:
> 
> Currently CephFS stripes a file byte stream across a set of objects (e.g. first MB in object 0, 2nd in object 1, etc..), and each of these objects is in turn replicated. Following a failure, PGs re-replicate objects.
> 
> In client-driven RAID the striping algorithm is changed, and clients calculate and distribute parity. In this case parity, rather than replication, provides redundancy. So one might consider storing objects in a pool with replication size 1. However, the standard PG that does replication wouldn't be able to handle faults correctly (parity rebuild, rather than re-replication), and a smart PG like the ErasureCodedPG would be needed.
> 
> So it seems like the problems are related, but I'm not sure exactly how much overlap there is :)
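To make the parity idea above concrete, here is a minimal sketch (plain Python, not Ceph code; the single XOR parity chunk is an assumption for illustration — real erasure codes such as Reed-Solomon tolerate more than one lost chunk):

```python
# Sketch of client-side parity: the client stripes data into k chunks
# and computes one XOR parity chunk. Any single missing chunk can then
# be rebuilt from the survivors, so no replication is needed.

def make_parity(chunks):
    """XOR all chunks together byte-by-byte to produce one parity chunk."""
    parity = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            parity[i] ^= b
    return bytes(parity)

def rebuild(survivors):
    """Rebuild the single missing chunk: XOR of parity plus survivors."""
    return make_parity(survivors)

data = [b"AAAA", b"BBBB", b"CCCC"]      # k = 3 data chunks
parity = make_parity(data)

# Simulate losing data[1]; rebuild it from the remaining chunks + parity.
recovered = rebuild([data[0], data[2], parity])
assert recovered == data[1]
```

The point of the sketch is only that fault handling changes: instead of re-replicating a lost object, the PG would have to fetch the surviving chunks and recompute the missing one.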

Do you refer to http://ceph.com/docs/master/architecture/#how-ceph-clients-stripe-data when talking about client-driven RAID ? My understanding is that it is designed to maximize throughput. This is done in the client library ( gateway, rbd or cephfs ). Since erasure encoding is about recovering from failures and would be implemented in libosd ( next to ReplicatedPG ), I am under the impression that there is no overlap.
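For reference, the striping described on that doc page can be sketched roughly as follows (stripe_unit, stripe_count and object_size are the documented striping parameters; the mapping below is my own simplification for illustration, not the actual implementation):

```python
# Rough sketch of how a file byte offset maps to an object under
# Ceph-style striping. Data is written in stripe units round-robin
# across stripe_count objects, moving to a new object set once the
# objects in the current set reach object_size.

def locate(offset, stripe_unit, stripe_count, object_size):
    """Map a byte offset to (object index, offset within that object)."""
    su = offset // stripe_unit                  # which stripe unit overall
    stripe = su // stripe_count                 # which stripe (row)
    units_per_object = object_size // stripe_unit
    object_set = stripe // units_per_object     # which group of objects
    obj = object_set * stripe_count + su % stripe_count
    off_in_obj = (stripe % units_per_object) * stripe_unit \
                 + offset % stripe_unit
    return obj, off_in_obj

MB = 2 ** 20
# "first MB in object 0, 2nd in object 1" with 1 MB units over 4 objects:
assert locate(0 * MB, MB, 4, 4 * MB) == (0, 0)
assert locate(1 * MB, MB, 4, 4 * MB) == (1, 0)
# After one full stripe, writing wraps back to object 0 at offset 1 MB:
assert locate(4 * MB, MB, 4, 4 * MB) == (0, MB)
```

This parallelism is what gives the throughput benefit, and it lives entirely client-side, which is why it seems orthogonal to recovery logic in libosd.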

What do you think ?

> 
> -Noah
> 
> 
>> Cheers
>>
>>> -Noah
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>> -- 
>> Loïc Dachary, Artisan Logiciel Libre
>> All that is necessary for the triumph of evil is that good people do nothing.
>>
> 
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
All that is necessary for the triumph of evil is that good people do nothing.



