On Wed, 5 Sep 2018, John Spray said:
> On Wed, Sep 5, 2018 at 8:38 AM Marc Roos <M.Roos@xxxxxxxxxxxxxxxxx> wrote:
> >
> > The advised solution is to upgrade ceph only in HEALTH_OK state. And I
> > also read somewhere that it is bad to have your cluster for a long time in
> > an HEALTH_ERR state.
> >
> > But why is this bad?

See https://ceph.com/community/new-luminous-pg-overdose-protection
under "Problems with past intervals"

"if the cluster becomes unhealthy, and especially if it remains unhealthy
for an extended period of time, a combination of effects can cause
problems."

"If a cluster is unhealthy for an extended period of time (e.g., days or
even weeks), the past interval set can become large enough to require a
significant amount of memory."

Sean

> Aside from the obvious (errors are bad things!), many people have
> external monitoring systems that will alert them on the transitions
> between OK/WARN/ERR. If the system is stuck in ERR for a long time,
> they are unlikely to notice new errors or warnings. These systems can
> accumulate faults without the operator noticing.
>
> > Why is this bad during upgrading?
>
> It depends what's gone wrong. For example:
> - If your cluster is degraded (fewer than desired number of replicas
> of data) then taking more services offline (even briefly) to do an
> upgrade will create greater risk to the data by reducing the number of
> copies available.
> - If your system is in an error state because something has gone bad
> on disk, then recovering it with the same software that wrote the data
> is a more tested code path than running some newer code against a
> system left in a strange state by an older version.
>
> There will always be exceptions to this (e.g. where the upgrade is the
> fix for whatever caused the error), but the general purpose advice is
> to get a system nice and clean before starting the upgrade.
>
> John
>
> > Can I quantify how bad it is? (like with large log/journal file?)
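
The "get the cluster clean before upgrading" advice above can be turned into a small pre-upgrade gate. What follows is only a rough sketch, not an official Ceph tool. It assumes that `ceph status --format json` reports the overall state under `health.status` (as Luminous does) or under `health.overall_status` (older releases). The function names `health_status`, `safe_to_upgrade` and `read_cluster_status` are made up for illustration.

```python
import json
import subprocess


def health_status(status_json: str) -> str:
    """Extract the overall health string from `ceph status --format json` output.

    Assumption: the state lives in health["status"] (Luminous and later) or
    health["overall_status"] (older releases). Anything else is "UNKNOWN".
    """
    data = json.loads(status_json)
    health = data.get("health", {})
    return health.get("status") or health.get("overall_status") or "UNKNOWN"


def safe_to_upgrade(status_json: str) -> bool:
    """Follow the general-purpose advice: only proceed from HEALTH_OK."""
    return health_status(status_json) == "HEALTH_OK"


def read_cluster_status() -> str:
    """Hypothetical helper: ask the local `ceph` CLI for the cluster status."""
    result = subprocess.run(
        ["ceph", "status", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

On a node with the `ceph` CLI and a working keyring, `safe_to_upgrade(read_cluster_status())` would give a yes/no answer. As John notes, there are exceptions (for example when the upgrade itself is the fix), so a check like this is better treated as a prompt to stop and look than as a hard rule.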
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com