Re: Why set osd flag to noout during upgrade ?

Yeah, you don't want to deal with backfilling while the cluster is
upgrading. At best it delays the upgrade; at worst, mixed-version
backfilling has (rarely) caused issues in the past.

We additionally set `noin` (`ceph osd set noin`) and disable the balancer
(`ceph balancer off`). The former prevents broken OSDs from re-entering the
cluster, and both similarly prevent backfilling from starting mid-upgrade.
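
As a rough sketch, the flags discussed in this thread could be wrapped
like this. The function names `pre_upgrade` and `post_upgrade` and the
`CEPH` variable are made up for illustration. Only the `ceph` subcommands
themselves come from the thread and the Ceph CLI. Turning the balancer
back on afterwards assumes it was enabled before the upgrade.

```shell
#!/bin/sh
# Sketch only: pre_upgrade, post_upgrade and CEPH are hypothetical names.
# CEPH can be pointed at another binary (e.g. echo) for a dry run.
CEPH="${CEPH:-ceph}"

pre_upgrade() {
    "$CEPH" osd set noout    # down OSDs are not marked out, so no re-replication starts
    "$CEPH" osd set noin     # OSDs that boot are not automatically marked in
    "$CEPH" balancer off     # the balancer does not start data movement mid-upgrade
}

post_upgrade() {
    # Assumption: the balancer was on before the upgrade; skip this line otherwise.
    "$CEPH" balancer on
    "$CEPH" osd unset noin
    "$CEPH" osd unset noout
}
```

Run `pre_upgrade` before touching the first daemon and `post_upgrade`
once the cluster is healthy again on the new version.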


.. Dan


On Wed, 22 Sep 2021, 12:18 Etienne Menguy, <etienne.menguy@xxxxxxxx> wrote:

> Hello,
>
> From my experience, I see three reasons:
> - You don’t want to re-replicate data that still exists on a down OSD;
> rebalancing can have a big impact on performance.
> - If the upgrade/maintenance goes wrong, you will want to focus on that
> issue and not have to deal with whatever Ceph has done in the meantime.
> - During an upgrade you have an ‘unusual’ cluster with different versions.
> It’s supposed to work, but you probably want to keep it ‘boring’.
>
> -
> Etienne Menguy
> etienne.menguy@xxxxxxxx
>
>
>
>
> > On 22 Sep 2021, at 11:51, Francois Legrand <fleg@xxxxxxxxxxxxxx> wrote:
> >
> > Hello everybody,
> >
> > I have a "stupid" question. Why do the docs recommend setting the OSD
> > noout flag during an upgrade/maintenance (and especially during an
> > OSD upgrade/maintenance)?
> >
> > In my understanding, if an OSD goes down, after a while (600s by
> > default) it's marked out and the cluster starts to rebuild its content
> > elsewhere in the cluster to maintain the redundancy of the data. This
> > generates some transfer and load on other OSDs, but that's not a big deal!
> >
> > As soon as the OSD is back, it's marked in again and Ceph is able to
> > determine which data is back, stop the recovery, and reuse the unchanged
> > data. Generally, the recovery is as fast as with the noout flag
> > (because with noout, the data modified during the down period still has
> > to be copied to the returning OSD).
> >
> > So is there another reason apart from limiting the transfer and the
> > load on other OSDs during the downtime?
> >
> > F
> >
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx



