HEALTH_ERR on OSD full

Hi,

I have a Ceph cluster running 0.80.1 with 80 OSDs.

I've had a fairly uneven distribution of data and have been keeping things
ticking along with "ceph osd reweight XX 0.x" commands on a few OSDs while
I try to increase the pg count of the pools to hopefully balance the data
better.
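
For reference, this is roughly what I've been running (the OSD ID and pool
name here are just placeholders):

    # nudge data off an overly full OSD (weight between 0 and 1)
    ceph osd reweight 12 0.85

    # raise the placement group count on a pool to spread data more evenly
    # (pgp_num needs to follow pg_num before rebalancing actually happens)
    ceph osd pool set rbd pg_num 1024
    ceph osd pool set rbd pgp_num 1024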

Tonight, one of the OSDs filled up to 95% and so was marked as "full".

This caused the whole cluster to be flagged as "full", and the server mapping
the RBDs hit a loadavg of over 800.  That server was rebooted, but I was then
unable to map any RBDs.
I've tweaked the reweight of the "full" OSD down, and it is now only "near
full".
As soon as that OSD changed state to "near full", the cluster status changed
to HEALTH_WARN and I was able to map RBDs again.
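
If it matters, I believe the thresholds involved are the mon full/nearfull
ratios (defaults of 0.95 and 0.85, as far as I know). Something like:

    # show which OSDs are flagged full / near full
    ceph health detail

    # overall cluster and per-pool usage
    ceph df

    # temporarily raise the full threshold for some breathing room
    # (I haven't actually done this; just noting it as an option)
    ceph pg set_full_ratio 0.97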

I was under the impression that a full OSD would just prevent data from being
written to that OSD, not cause the near-catastrophic cluster unavailability
I've experienced.

The cluster is only around 65% full overall, so there is plenty of space
on the other OSDs.

Can anyone please clarify whether this behaviour is expected?

Regards

J