Yes, that's expected behavior. Since the cluster can't move data around on
its own, and lots of things will behave *very badly* if some of their writes
go through but others don't, the cluster goes read-only once any OSD is
full. That's why nearfull is a warning condition; you really want to even
out the balance well before it gets to that point (some commands for
checking and correcting this are sketched at the bottom of this mail).

A cluster at 65% full overall with a single OSD at 95% is definitely *not*
normal, so you seem to be doing something wrong or out of the ordinary. (A
variance of 20% from fullest to emptiest isn't too unusual, but 30% from
fullest to *average* definitely is.)
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com

On Fri, Jul 18, 2014 at 3:15 PM, James Eckersall
<james.eckersall at gmail.com> wrote:
> Hi,
>
> I have a ceph cluster running on 0.80.1 with 80 OSDs.
>
> I've had a fairly uneven distribution of the data and have been keeping
> it ticking along with "ceph osd reweight XX 0.x" commands on a few OSDs
> while I try to increase the pg count of the pools to hopefully balance
> the data better.
>
> Tonight, one of the OSDs filled up to 95% and so was marked as "full".
>
> This caused the cluster to be flagged as "full", and the server mapping
> the RBDs hit a load average of over 800. That server was rebooted and I
> was unable to map any RBDs.
> I've tweaked the reweight of the "full" OSD down and it is now "near
> full".
> As soon as that OSD changed state to "near full", the cluster changed
> status to HEALTH_WARN and I'm able to map RBDs again.
>
> I was of the opinion that a full OSD would just prevent data from being
> written to that OSD, not cause the near-catastrophic cluster
> unavailability that I've experienced.
>
> The cluster is around 65% full of data, so there is really plenty of
> space across the other OSDs.
>
> Can anyone please clarify whether this behaviour is normal?
>
> Regards
>
> J
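
To check how close each OSD is getting to those thresholds, something like
the following is a reasonable starting point on a Firefly (0.80.x) cluster;
this is just a rough sketch, and the exact output varies a bit by release:

    ceph health detail    # lists which OSDs are currently nearfull or full
    ceph df               # overall and per-pool usage
    ceph pg dump osds     # per-OSD used/available space
                          # (newer releases also have "ceph osd df")

The defaults are 0.85 for nearfull and 0.95 for full (the
mon_osd_nearfull_ratio and mon_osd_full_ratio settings), so the warning is
there to give you a window to react in before writes stop.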
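
As for evening out the balance: rather than hand-tuning "ceph osd reweight"
on individual OSDs you can let the cluster pick the outliers, and increasing
the pg count of the pools (as you're already planning) is the longer-term
fix. Again just a sketch; check the help output on your release first, and
note that the pool name and pg numbers below are only placeholder examples:

    # reduce the reweight of any OSD more than 20% above average utilization
    ceph osd reweight-by-utilization 120

    # split an under-sized pool: raise pg_num first, then pgp_num
    ceph osd pool set <poolname> pg_num 1024
    ceph osd pool set <poolname> pgp_num 1024

If you get completely wedged again, the full threshold can be raised a
little as a short-term escape hatch while data moves off (iirc on 0.80.x
that's "ceph pg set_full_ratio 0.97"), but don't leave it there, since a
disk that actually hits 100% is much harder to recover from.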