Re: Stuck in upgrade

Hi Jan,

It looks like you got into this situation by not setting
require-osd-release to pacific while you were running 16.2.7.
The code expects that value to have been raised already. Unluckily
for you, had you upgraded to 16.2.8 first, you would have seen a
HEALTH_WARN pointing out the mismatch between require_osd_release
and the running version:
https://tracker.ceph.com/issues/53551
https://github.com/ceph/ceph/pull/44259
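
The interaction between the "can only be raised" rule and the assertion discussed in this thread can be sketched as a small model. This is a simplified illustration, not the actual Ceph code: `Release`, `Outcome`, and `try_set_require_osd_release` are hypothetical names, and the `floor` parameter stands in for the release named in the `ceph_assert` quoted below.

```cpp
#include <cassert>

// Hypothetical stand-in for Ceph's release ordering (not the real ceph_release_t).
enum class Release { nautilus = 14, octopus = 15, pacific = 16, quincy = 17 };

enum class Outcome { ok, refused_lowering, assertion_failed };

// Models a monitor handling a request to change require_osd_release.
// `floor` is the lowest *current* value the monitor assumes the cluster
// already has (the role played by the ceph_assert in the thread).
Outcome try_set_require_osd_release(Release current, Release requested,
                                    Release floor) {
  if (current < floor) {
    // In the real monitor this would be a failed assertion, i.e. a crash.
    return Outcome::assertion_failed;
  }
  if (requested < current) {
    // The value can only be raised, never lowered.
    return Outcome::refused_lowering;
  }
  return Outcome::ok;
}
```

Under these assumptions, a cluster still at nautilus that asks a monitor with an octopus floor for pacific ends in `assertion_failed`, which matches the crash reported below. With a nautilus floor, the same request would return `ok`.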

Cheers, Dan

On Fri, Oct 7, 2022 at 10:05 AM Jan Marek <jmarek@xxxxxx> wrote:
>
> Hello,
>
> My cluster is healthy now.
>
> I've studied the OSDMonitor.cc file and found that some of its
> logic is problematic.
>
> Assumptions:
>
> 1) require_osd_release can only be raised.
>
> 2) The lowest value that ceph-mon 17.2.3 can set
> require_osd_release to is 'octopus'.
>
> I see two variants:
>
> 1) If I am allowed to set require_osd_release to octopus, then
> its current value must be 'nautilus' (I would be raising
> require_osd_release from nautilus to octopus). In that case,
> line 11618 in OSDMonitor.cc has to read:
>
> ceph_assert(osdmap.require_osd_release >= ceph_release_t::nautilus);
>
> 2) If line 11618 in OSDMonitor.cc has to stay as:
>
> ceph_assert(osdmap.require_osd_release >= ceph_release_t::octopus);
>
> then being able to set require_osd_release to 'octopus' makes
> no sense, because this line guarantees that require_osd_release
> is already set to at least octopus.
>
> I suggest using variant 1), and I am sending the attached patch.
>
> Another question is whether the MON daemon should check
> require_osd_release when it joins the cluster, given that it
> cannot raise the value.
>
> This is a potentially dangerous situation; see my old e-mail below...
>
> Sincerely
> Jan Marek
>
> On Mon, Oct 03, 2022 at 11:26:51 CEST, Jan Marek wrote:
> > Hello,
> >
> > I have a problem with our Ceph cluster: I am stuck in the
> > upgrade process between versions 16.2.7 and 17.2.3.
> >
> > The problem is that I have upgraded the MON, MGR, and MDS
> > daemons, and when I started upgrading the OSDs, Ceph told me
> > that I cannot add an OSD with that version to the cluster
> > because of a problem with require_osd_release.
> >
> > In my osdmap I have:
> >
> > # ceph osd dump | grep require_osd_release
> > require_osd_release nautilus
> >
> > When I tried to set this to octopus or pacific, my MON daemon
> > crashed with this assertion:
> >
> > ceph_assert(osdmap.require_osd_release >= ceph_release_t::octopus);
> >
> > in OSDMonitor.cc on line 11618.
> >
> > Please, is there a way to repair it?
> >
> > Can I (temporarily) change the ceph_assert to this line:
> >
> > ceph_assert(osdmap.require_osd_release >= ceph_release_t::nautilus);
> >
> > and set require_osd_release to, say, pacific?
> >
> > I tried to downgrade the ceph-mon process back to version 16.2,
> > but then it cannot join the cluster...
> >
> > Sincerely
> > Jan Marek
> > --
> > Ing. Jan Marek
> > University of South Bohemia
> > Academic Computer Centre
> > Phone: +420389032080
> > http://www.gnu.org/philosophy/no-word-attachments.cs.html
>
>
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
>
> --
> Ing. Jan Marek
> University of South Bohemia
> Academic Computer Centre
> Phone: +420389032080
> http://www.gnu.org/philosophy/no-word-attachments.cs.html