Re: Combining erasure coding and replication?

Hi,

> I guess what you are suggesting is something like k+m with m>=k+2, for example k=4, m=6. Then, one can distribute 5 shards per DC and sustain the loss of an entire DC while still having full access to redundant storage.

That's exactly what I mean, yes.
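For the archives, here is roughly what that looks like. This is only a sketch; the profile, rule and pool names, the rule id and the bucket names are examples and have to be adapted to the actual CRUSH tree:

# EC profile with k=4, m=6; placement across the DCs is handled by a custom CRUSH rule
ceph osd erasure-code-profile set ec46 k=4 m=6 crush-failure-domain=host

# export and decompile the CRUSH map
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# add a rule like this (assuming two 'datacenter' buckets under the 'default' root):
rule ec46_two_dc {
    id 10                                  # any unused rule id
    type erasure
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take default
    step choose indep 2 type datacenter    # pick both datacenters
    step chooseleaf indep 5 type host      # 5 shards on distinct hosts in each DC
    step emit
}

# compile and inject the modified map, then create the pool with profile and rule
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new
ceph osd pool create ecpool 256 256 erasure ec46 ec46_two_dc

If I'm not mistaken, min_size for the EC pool defaults to k+1 = 5 here, so with one DC down the remaining 5 shards keep the pool available, just without spare redundancy until the DC is back.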

> Now, a long time ago I was in a lecture about error-correcting codes (Reed-Solomon codes). From what I remember, the computational complexity of these codes explodes at least exponentially with m. Out of curiosity, how does m>3 perform in practice? What's the CPU requirement per OSD?

Such a setup would usually be considered for archiving purposes, so the performance requirements aren't very high, but so far we haven't heard any complaints performance-wise.
I don't have details on the CPU requirements at hand right now.
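If you want to get a feel for it on your own hardware, a simple approach is to write to a test EC pool with rados bench and watch the ceph-osd CPU usage on the hosts at the same time (the pool name below is just an example):

# 60 seconds of writes with 16 concurrent ops, keep the objects for the read test
rados bench -p ecpool 60 write -t 16 --no-cleanup
# sequential reads of the benchmark objects
rados bench -p ecpool 60 seq -t 16
# remove the benchmark objects again
rados -p ecpool cleanup

Repeating that with profiles for different m values should show quickly whether the encoding overhead matters for your workload.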

Regards,
Eugen


Zitat von Frank Schilder <frans@xxxxxx>:

Dear Eugen,

I guess what you are suggesting is something like k+m with m>=k+2, for example k=4, m=6. Then, one can distribute 5 shards per DC and sustain the loss of an entire DC while still having full access to redundant storage.

Now, a long time ago I was in a lecture about error-correcting codes (Reed-Solomon codes). From what I remember, the computational complexity of these codes explodes at least exponentially with m. Out of curiosity, how does m>3 perform in practice? What's the CPU requirement per OSD?

Best regards,

=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Eugen Block <eblock@xxxxxx>
Sent: 27 March 2020 08:33:45
To: ceph-users@xxxxxxx
Subject:  Re: Combining erasure coding and replication?

Hi Brett,

Our concern with Ceph is the cost of having three replicas. Storage
may be cheap but I’d rather not buy ANOTHER 5 PB for a third replica
if there are ways to do this more efficiently. Site-level redundancy
is important to us so we can’t simply create an erasure-coded volume
across two buildings – if we lose power to a building, the entire
array would become unavailable.

Can you elaborate on that? Why is EC not an option? We have
installed several clusters spanning two datacenters that are
resilient to losing a whole DC (and additional disks if required).
So it's basically a matter of choosing the right EC profile. Or did
I misunderstand something?


Zitat von Brett Randall <brett.randall@xxxxxxxxx>:

Hi all

Had a fun time trying to join this list, hopefully you don’t get
this message 3 times!

On to Ceph… We are looking at setting up our first-ever Ceph cluster
to replace Gluster as our media asset storage and production system.
The Ceph cluster will have 5 PB of usable storage. Whether we use it
as object storage, or put CephFS in front of it, is still TBD.

Obviously we’re keen to protect this data well. Our current Gluster
setup utilises RAID-6 on each of the nodes and then we have a single
replica of each brick. The Gluster bricks are split between
buildings so that the replica is guaranteed to be in another
premises. By doing it this way, we guarantee that we can have a
decent number of disk or node failures (even an entire building)
before we lose both connectivity and data.

Our concern with Ceph is the cost of having three replicas. Storage
may be cheap but I’d rather not buy ANOTHER 5 PB for a third replica
if there are ways to do this more efficiently. Site-level redundancy
is important to us so we can’t simply create an erasure-coded volume
across two buildings – if we lose power to a building, the entire
array would become unavailable. Likewise, we can’t simply have a
single replica – our fault tolerance would drop way below what it
is right now.

Is there a way to use both erasure coding AND replication at the
same time in Ceph to mimic the architecture we currently have in
Gluster? I know we COULD just create RAID-6 volumes on each node and
use the entire volume as a single OSD, but this is not the
recommended way to use Ceph. So is there some other way?

Apologies if this is a nonsensical question, I’m still trying to
wrap my head around Ceph, CRUSH maps, placement rules, volume types,
etc etc!

TIA

Brett


_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



