We have a report that, due to an OSD bug, it is possible to fill a
cluster (using "rados bench write") to the point that some OSDs reach
100% full.
The issue is http://tracker.ceph.com/issues/16878
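
For reference, this is roughly what "filling" means here. The original
report used "rados bench write"; the sketch below is only a rough librados
equivalent I am putting here for illustration (the pool name, object size
and ceph.conf path are placeholders, not values from the report) - it just
keeps writing objects until the cluster refuses them, printing raw usage
as it goes:

    #!/usr/bin/env python
    # Rough sketch, NOT the reporter's rados bench command: write 4 MB
    # objects via librados until the cluster errors out, printing raw
    # usage along the way.  Pool name, object size and conffile path are
    # assumptions for illustration only.
    import rados

    POOL = 'rbd'                        # assumed test pool
    OBJ_SIZE = 4 * 1024 * 1024          # assumed 4 MB payload per object

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx(POOL)
    payload = b'\0' * OBJ_SIZE

    try:
        i = 0
        while True:
            ioctx.write_full('fill-%08d' % i, payload)
            i += 1
            if i % 100 == 0:
                stats = cluster.get_cluster_stats()
                print('%.1f%% raw used'
                      % (100.0 * stats['kb_used'] / stats['kb']))
    except rados.Error as exc:
        print('cluster stopped accepting writes: %s' % exc)
    finally:
        ioctx.close()
        cluster.shutdown()
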
The report came from a cluster that had 24 OSDs, each 1 TB in size and
with an 87 GB journal on an external partition.
I tried to reproduce this on a much smaller cluster (2 OSDs, each with a
10 GB data partition and a 1 GB journal partition) using
teuthology-openstack. I was able to reliably fill this cluster to 98%
(past the 95% FULL mark) when I constructed the journal partitions so
that they do not end on a 2048-sector boundary.
(When the journal partitions are "evenly sized", i.e. when they do end on
a 2048-sector boundary, the cluster fills up to 95% and usage does not
rise further.)
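
To be concrete about what I mean by "end on a 2048-sector boundary": a
partition ends on such a boundary when its start sector plus its length
in sectors is a multiple of 2048 (i.e. the end is 1 MiB-aligned). Here is
a minimal sketch of that check using the standard sysfs attributes (the
device/partition names passed on the command line are just examples, e.g.
whatever your journal partition happens to be):

    #!/usr/bin/env python
    # Minimal sketch of the alignment check: a partition "ends on a
    # 2048-sector boundary" when start + size (both in 512-byte sectors,
    # as exported by sysfs) is a multiple of 2048.
    # Usage (example device names): check_align.py vdb vdb2
    import sys

    def end_sector(disk, part):
        base = '/sys/block/%s/%s' % (disk, part)
        with open(base + '/start') as f:
            start = int(f.read())
        with open(base + '/size') as f:
            size = int(f.read())
        return start + size

    if __name__ == '__main__':
        disk, part = sys.argv[1], sys.argv[2]
        end = end_sector(disk, part)
        print('%s ends at sector %d (%s2048-aligned)'
              % (part, end, '' if end % 2048 == 0 else 'NOT '))
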
I would be grateful if someone who is much more familiar with the OSD
code than I am (and that is not a difficult criterion to meet!) could
look at the bug report - I have posted logs and a detailed analysis.
Thanks for your time!
--
Nathan Cutler
Software Engineer, Distributed Storage
SUSE LINUX, s.r.o.
Tel.: +420 284 084 037