Re: Rados bench with a failed node

Hi Sage,
You are correct! The problem does occur only with my erasure code,
but I'm not sure what causes it.
I've reproduced the issue in a virtual cluster created by vstart.sh
(with logging enabled).
I compared the logs of all OSDs running my erasure code against those
of a similar lrc implementation, but found no visible differences.
Any tips on what I should look for?
What in the erasure code source could affect this?

Another interesting thing I'm seeing is that even cleanup fails:

>  bin/rados -p optlrcpool cleanup
error during cleanup: -5
error 5: (5) Input/output error

This only occurs when I set 'noout', stop an OSD, and then write and
read with the benchmark.


I should mention that my erasure code works properly outside this
experiment: it recovers from a failed node during normal operation.
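
One way to test the guess quoted below (that the stripe is not
reconstructed correctly when a shard is missing) without a cluster is a
round-trip check: encode a buffer, drop each shard in turn, decode from
the survivors, and compare with the original. The sketch below is only
an illustration of that check. It uses a toy single-parity XOR code, not
my erasure code and not Ceph's plugin interface; encode, decode and
check_single_failures are hypothetical names made up for this example.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, k):
    """Toy code (assumption, not a Ceph plugin): k data shards + 1 XOR parity.

    Returns a dict mapping shard index -> shard bytes; index k is the parity.
    """
    shard_len = -(-len(data) // k)              # ceiling division
    padded = data.ljust(shard_len * k, b"\0")   # zero-pad to a full stripe
    shards = {i: padded[i * shard_len:(i + 1) * shard_len] for i in range(k)}
    shards[k] = reduce(xor_bytes, shards.values())
    return shards


def decode(available, k, size):
    """Rebuild the original buffer from the shards that are still available."""
    missing = [i for i in range(k + 1) if i not in available]
    if len(missing) > 1:
        # A single-parity code cannot tolerate more than one lost shard.
        raise IOError("too many shards missing")
    shards = dict(available)
    if missing and missing[0] < k:
        # A lost data shard is the XOR of every surviving shard.
        shards[missing[0]] = reduce(xor_bytes, available.values())
    return b"".join(shards[i] for i in range(k))[:size]


def check_single_failures(data, k):
    """Return the shard indices whose loss breaks the round trip."""
    shards = encode(data, k)
    failures = []
    for lost in range(k + 1):
        survivors = {i: s for i, s in shards.items() if i != lost}
        if decode(survivors, k, len(data)) != data:
            failures.append(lost)
    return failures


if __name__ == "__main__":
    payload = b"benchmark_data object payload"
    print("shards whose loss breaks decoding:", check_single_failures(payload, 4))
```

An empty result means every single-shard loss decodes back to the
original buffer. Running the same loop against the real encode/decode
path, over every shard position and a few object sizes, might show
whether one particular missing shard is what turns into the EIO.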

Thanks,
Oleg

On Tue, Aug 1, 2017 at 4:21 PM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Tue, 1 Aug 2017, Oleg Kolosov wrote:
>> Hi
>> I'm developing my own erasure code and I'm trying to run rados
>> benchmark on it. My experiment consists of writing objects, then while
>> reading them I stop an OSD and see how it affects latency.
>> While running this experiment with 'rados bench' I get an error after
>> stopping the osd:
>>
>> benchmark_data_adm-1_3806_object8 is not correct!
>> read got -5
>> error during benchmark: -5
>> error 5: (5) Input/output error
>>
>> Is it expected to fail like this?
>
> It's hard to say what's happening without logs, but my guess would be that
> your erasure code isn't correctly reconstructing the stripe when a
> shard is missing.
>
>> Is there a way to bypass it or somehow perform my experiment?
>
> If it's what I'm guessing, not really... nor would you want to.  I would
> enable osd logging to see where the EIO is coming from.
>
> sage