I can't remember the details now, but I know that recovery needed
additional work. If it were a simple fix, I would have done it when
implementing that code.
I found this bug related to recovery and EC errors
(http://tracker.ceph.com/issues/13493):
BUG #13493: osd: for ec, cascading crash during recovery if one shard is
corrupted
David
On 12/4/15 2:03 AM, Markus Blank-Burian wrote:
Hi David,
I am using ceph 9.2.0 with an erasure coded pool and have some problems with
missing objects.
Reads for degraded/backfilling objects on an EC pool that detect an error
(-2, i.e. -ENOENT, in my case) seem to be aborted immediately instead of
reading from the remaining shards. Why is there an explicit check for
"!rop.for_recovery" in ECBackend::handle_sub_read_reply? Would it be possible
to remove this check and let the recovery read be completed from the
remaining good shards?
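
To make the question concrete, here is a minimal toy model of the behavior
described above. It is not Ceph code: it assumes a k=2, m=1 layout with a
single XOR parity shard, and every name in it (encode, read_shard,
read_object, skip_failed_shards_in_recovery) is hypothetical. It only
illustrates the difference between aborting a recovery read on the first
shard error and continuing with the remaining good shards.

```python
from typing import Dict, Optional, Tuple

ENOENT = 2  # the "-2" error mentioned above


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length buffers."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(d0: bytes, d1: bytes) -> Dict[int, bytes]:
    """Toy k=2, m=1 code: two data shards plus one XOR parity shard."""
    assert len(d0) == len(d1)
    return {0: d0, 1: d1, 2: xor_bytes(d0, d1)}


def decode(good: Dict[int, bytes]) -> bytes:
    """Rebuild the object from any two of the three shards."""
    if 0 in good and 1 in good:
        d0, d1 = good[0], good[1]
    elif 0 in good:                       # shards 0 and 2: rebuild d1
        d0 = good[0]
        d1 = xor_bytes(good[0], good[2])
    else:                                 # shards 1 and 2: rebuild d0
        d1 = good[1]
        d0 = xor_bytes(good[1], good[2])
    return d0 + d1


def read_shard(shards: Dict[int, bytes],
               shard_id: int) -> Tuple[int, Optional[bytes]]:
    """Return (errno, data); a missing shard yields -ENOENT."""
    data = shards.get(shard_id)
    if data is None:
        return -ENOENT, None
    return 0, data


def read_object(shards: Dict[int, bytes], for_recovery: bool,
                skip_failed_shards_in_recovery: bool) -> Optional[bytes]:
    """Read until two good shards are found, then decode.

    With skip_failed_shards_in_recovery=False, a recovery read gives up on
    the first shard error (the behavior asked about above). With True, it
    moves on to the remaining shards, as a normal read does in this model.
    """
    good: Dict[int, bytes] = {}
    for shard_id in (0, 1, 2):
        err, data = read_shard(shards, shard_id)
        if err != 0:
            if for_recovery and not skip_failed_shards_in_recovery:
                return None               # abort immediately
            continue                      # try the next shard
        good[shard_id] = data
        if len(good) == 2:
            break
    if len(good) < 2:
        return None                       # not enough shards to decode
    return decode(good)


shards = encode(b"abcd", b"efgh")
del shards[1]                             # simulate a missing shard

client_read = read_object(shards, False, False)
recovery_abort = read_object(shards, True, False)
recovery_retry = read_object(shards, True, True)
```

In this sketch the recovery read with the flag off returns nothing even
though shards 0 and 2 are enough to rebuild the object. The real recovery
path may have further constraints that this model ignores.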
Markus