On 10/25/2017 05:16 AM, Sage Weil wrote:
Hi Xingguo,
On Wed, 25 Oct 2017, xie.xingguo@xxxxxxxxxx wrote:
I wonder why erasure-coded pools cannot currently support omap.
The simplest way I can think of for erasure-coded pools to support omap would be to duplicate the omap on every shard.
Is it because that consumes too much space as k + m gets bigger?
Right. There isn't a nontrivial way to actually erasure code it, and
duplicating on every shard is inefficient.
One reasonableish approach would be to replicate the omap data on m+1
shards. But it's a bit of work to implement and nobody has done it.
I can't remember if there were concerns with this approach or it was just
a matter of time/resources... Josh? Greg?
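To put rough numbers on the two options above, here is a minimal sketch. It is not Ceph code: the function names are made up for illustration, and it assumes a pool with k data shards and m coding shards that is expected to tolerate the loss of any m shards. It counts the omap copies each scheme stores and checks by brute force that m+1 copies always leave at least one survivor.

```python
from itertools import combinations


def omap_copies_full_duplication(k, m):
    """Copies stored if the omap is duplicated on every shard."""
    return k + m


def omap_copies_partial_replication(k, m):
    """Copies stored if the omap is replicated on only m + 1 shards."""
    return m + 1


def survives_any_m_failures(k, m, omap_shards):
    """True if at least one omap copy remains after losing any m shards.

    omap_shards is an iterable of shard ids (0 .. k+m-1) that hold the omap.
    """
    holders = set(omap_shards)
    all_shards = range(k + m)
    # Try every possible set of m lost shards and look for a surviving holder.
    return all(holders - set(lost) for lost in combinations(all_shards, m))
```

For k=4, m=2 this gives 6 copies with full duplication versus 3 with m+1 replication. Any 3 holders survive every 2-shard failure, while 2 holders do not.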
It restricts us to erasure codes like Reed-Solomon, where a subset of
shards is always updated. I think this is a reasonable trade-off,
though; it's just a matter of implementing it. We haven't written
up the required peering changes, but they did not seem too difficult to
implement.
Some notes on the approach are here - just think of 'replicating omap'
as a partial write to m+1 shards:
http://pad.ceph.com/p/ec-partial-writes
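As a rough illustration of why m+1 is the natural shard count for a partial write, here is a small sketch. It assumes a systematic code laid out with data on shards 0..k-1 and parity on shards k..k+m-1, where every parity shard depends on every data shard. The function names and the choice of which shards hold the omap are hypothetical. They are not taken from the pad or from Ceph.

```python
def shards_touched_by_partial_write(k, m, data_shard):
    """Shard ids rewritten when a single data shard changes.

    Assumption: systematic layout, data on 0..k-1, parity on k..k+m-1,
    so one data shard plus all m parity shards must be updated.
    """
    if not 0 <= data_shard < k:
        raise ValueError("data_shard must be in [0, k)")
    return [data_shard] + list(range(k, k + m))


def omap_shards(k, m):
    """Hypothetical omap placement: first data shard plus all parity shards."""
    return shards_touched_by_partial_write(k, m, 0)
```

With k=4, m=2, a write to data shard 1 touches shards 1, 4 and 5, which is m+1 = 3 shards. Placing the omap on shard 0 plus the parity shards makes an omap update look like a partial write to shard 0.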