Hello,

I'm using ceph-jewel 10.2.7 for some tests. I discovered that when an OSD is full (e.g. full_ratio=0.95), client writes fail, which is normal. However, a full OSD does not stop a recovering cluster from writing data to it, which can push the OSD's used ratio from 95% to 100%. When that happens, the OSD goes down with "no space left on device" and cannot start up anymore.

So the question is: can the cluster automatically stop recovery while an OSD is approaching full, without setting the norecover flag manually? Or is this already fixed in the latest version?

Consider this situation: a half-full cluster with many OSDs. Through some bad luck in the middle of the night (a network link down, a server down, or similar), some OSDs go down/out and trigger cluster recovery, which drives some other healthy OSDs to 100% usage (I have little operations and maintenance experience, so please correct me if I'm wrong). Unluckily, this spreads like a plague and takes down many more OSDs. It may be easy to fix one OSD that went down this way, but it is a disaster to fix 10+ OSDs with 100% of their space used.

Here are my test environment and steps: three nodes, each node running one monitor and one OSD (a 10G HDD for convenience), all in VMs. The ceph.conf is basic, and the pool size is set to 2. I used 'rados bench' to write data to the OSDs.

1. Lower the full ratios:

# ceph pg set_full_ratio 0.8
# ceph pg set_nearfull_ratio 0.7

2. Write data; when an OSD is nearing full, stop writing and mark one OSD out:

# ceph osd out 0

3. Wait for cluster recovery to finish, then run:

# ceph osd df
# ceph osd tree

We can see that the other OSDs are down.

Thanks and Best Regards!

He Handong
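
P.S. Some extra details and sketches below, in case anyone wants to reproduce or work around this.

The "basic" ceph.conf amounts to little more than the following (the hostnames and addresses are placeholders for my three VMs):

    [global]
    fsid = <your fsid>
    mon initial members = node1, node2, node3
    mon host = 192.168.1.11, 192.168.1.12, 192.168.1.13
    osd pool default size = 2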
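
The whole test collected into one script (a sketch of the steps above; the pool name 'bench' and the pg count are just what I happened to use):

    # assumes a healthy 3-OSD cluster and the admin keyring on this host
    ceph osd pool create bench 64 64
    ceph osd pool set bench size 2

    # lower the ratios so a 10G disk fills up quickly
    ceph pg set_full_ratio 0.8
    ceph pg set_nearfull_ratio 0.7

    # write until one OSD approaches the full ratio, then Ctrl-C
    rados bench -p bench 600 write --no-cleanup

    # trigger recovery by marking one OSD out
    ceph osd out 0

    # once recovery has run for a while, check the damage
    ceph osd df
    ceph osd tree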
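
The only mitigation I currently know is to set the flags by hand once I notice the problem:

    ceph osd set norecover
    ceph osd set nobackfill
    # ...free space on or replace the full OSDs, then:
    ceph osd unset nobackfill
    ceph osd unset norecover

As a stopgap one could script this, e.g. a cron job that sets norecover as soon as any OSD crosses a threshold. A naive sketch (the 85% threshold is arbitrary, and the "utilization" field name should be checked against your version's 'ceph osd df -f json' output):

    #!/bin/sh
    # set norecover when any OSD is more than 85% used
    ceph osd df -f json | python -c '
    import json, subprocess, sys
    nodes = json.load(sys.stdin)["nodes"]
    if any(n["utilization"] > 85.0 for n in nodes):
        subprocess.call(["ceph", "osd", "set", "norecover"])
    '

But this is exactly the kind of thing I would expect the cluster to do by itself, hence the question above.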