On 19 Oct 2016 13:00, Ronny Aasen wrote:
On 06 Oct 2016 13:41, Ronny Aasen wrote:
Hello,
I have a few OSDs in my cluster that are regularly crashing.
[snip]
Of course, having 3 OSDs dying regularly is not good for my health, so I
have set noout to avoid heavy recoveries.
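For reference, that flag is just the standard cluster-wide toggle,
something like:
# keep the down OSDs from being marked out and triggering heavy recovery
ceph osd set noout
# and later, once the OSDs are stable again
ceph osd unset noout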
Googling this error message gives exactly one hit:
https://github.com/ceph/ceph/pull/6946
where it says: "the shard must be removed so it can be reconstructed".
But with my 3 OSDs failing, I am not certain which of them contains the
broken shard (or perhaps all 3 of them?).
I am a bit reluctant to delete on all 3. I have 4+2 erasure coding
(erasure size 6, min_size 4), so finding out which one is bad would be
nice.
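For context, which OSDs are supposed to hold the six shards of a PG (the
up/acting set) can be listed with something like this, where <pgid> is a
placeholder for the affected placement group:
# which OSDs hold (or should hold) the shards of this PG
ceph pg map <pgid>
# per-shard detail for the PG
ceph pg <pgid> query
# overview of inconsistent/problematic PGs
ceph health detail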
I hope someone has an idea how to proceed.
kind regards
Ronny Aasen
I again have this problem with crashing OSDs. A more detailed log is at
the tail of this mail.
Does anyone have any suggestions on how I can identify which shard needs
to be removed to allow the EC pool to recover?
And, more importantly, how I can stop the OSDs from crashing?
kind regards
Ronny Aasen
Answering my own question for googleability, using this one-liner:
# list every on-disk shard of the problematic object across the local OSDs
for dir in $(find /var/lib/ceph/osd/ceph-* -maxdepth 2 -type d -name '5.26*' | sort -u); do
    find "$dir" -name '*3a3938238e1f29.00000000002d80ca*' -type f -ls
done
I got a list of all shards of the problematic object.
One of the objects had size 0 but was otherwise readable without any I/O
errors. I guess this explains the inconsistent size, but it does not
explain why Ceph decides it's better to crash 3 OSDs rather than move a
0-byte file into a "LOST+FOUND"-style directory structure, or just
delete it, since it will not contain any useful data anyway.
Removing this file (moving it to /tmp) allowed the 3 broken OSDs to
start, and they have been running for >24h now, whereas usually they
would crash within 10 minutes. Yay!
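Roughly what that looked like per affected OSD host (a sketch: the OSD
id and shard path below are placeholders taken from the find output, and
the unit name assumes a systemd-managed install):
# stop the affected OSD first (hypothetical OSD id)
systemctl stop ceph-osd@12
# set the zero-byte shard aside rather than deleting it outright
mv '/var/lib/ceph/osd/ceph-12/current/5.26s1_head/<shard file from the find above>' /tmp/
# bring the OSD back up
systemctl start ceph-osd@12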
Generally, you need to check _all_ shards on the given PG, not just the
3 crashing OSDs. This was what confused me, since I only focused on the
crashing OSDs.
I used the one-liner that checks the OSDs for the PG, since due to
backfilling the PG was spread all over the place, and I could run it
from Ansible to reduce the tedious work.
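A minimal sketch of that ad-hoc run, assuming an inventory group named
"osds" for the OSD hosts (the group name is my own):
# run the shard search on every OSD host in one go
ansible osds -m shell -a "for dir in \$(find /var/lib/ceph/osd/ceph-* -maxdepth 2 -type d -name '5.26*' | sort -u); do find \$dir -name '*3a3938238e1f29.00000000002d80ca*' -type f -ls; done"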
Also, it would be convenient to be able to mark a broken/inconsistent PG
manually "inactive", instead of crashing 3 OSDs and taking lots of other
PGs down with them. One could set the PG inactive while troubleshooting
and unset it when done, without the OSD crashes and all the subsequent
high-load rebalancing.
Also, I ran a find for 0-size files on that PG and there are multiple
other such files. Is a 0-byte rbd_data file on a PG a normal occurrence,
or could I have more similar problems in the future due to the other
0-size files?
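(For reference, that check was roughly this, reusing the PG directory
pattern from the earlier one-liner:)
# list zero-byte objects in the PG's directories on this OSD host
for dir in $(find /var/lib/ceph/osd/ceph-* -maxdepth 2 -type d -name '5.26*' | sort -u); do
    find "$dir" -type f -size 0 -ls
done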
kind regards
Ronny Aasen
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com