Re: pg log hard limit upgrade bug

Neha,

Josh and I are testing changes related to this here:
https://github.com/ceph/ceph/pull/24938
We will do the same for mimic-x in preparation for nautilus, if that sounds good?

Thx
YuriW
On Mon, Nov 5, 2018 at 3:37 PM Neha Ojha <nojha@xxxxxxxxxx> wrote:
>
> Nathan, I don't think we want to revert it for 13.2.2.
>
> This is because the pg log hard limit feature currently doesn't seem
> to work well in a partial upgrade, recovery/backfill scenario. So,
> even if we do revert it in 13.2.3, this still leaves us a chance of
> going into a split scenario, where some osds in the field, running
> 13.2.2(with hard limit code) and others on 13.2.3(without the code),
> may encounter http://tracker.ceph.com/issues/36686.
>
> Therefore, users who have successfully upgraded to 13.2.2 shouldn't be
> at any risk.
> For users trying to upgrade to a version >= 13.2.2, I am going to make
> a note of this issue and add the suggested workaround in Pending
> Release Notes for mimic.
>
> Does that make sense?
>
> Thanks,
> Neha
>
> On Mon, Nov 5, 2018 at 2:43 PM, Yuri Weinstein <yweinste@xxxxxxxxxx> wrote:
> > Acknowledged
> >
> > On Mon, Nov 5, 2018 at 2:35 PM Nathan Cutler <ncutler@xxxxxxx> wrote:
> >>
> >> Thanks, Neha. The luminous revert was just merged and we'll cut 12.2.10
> >> to push it out to users.
> >>
> >> Regarding Mimic, will there be a revert there as well? Since the pg hard
> >> limit patches are present in 13.2.2, it sounds like we'll need to revert
> >> them before we release 13.2.3?
> >>
> >> (Note that Yuri was planning to start QE for 13.2.3 - Yuri, please hold
> >> off on that for now?)
> >>
> >> Nathan
> >>
> >> On 11/5/18 6:50 PM, Neha Ojha wrote:
> >> > Hi All,
> >> >
> >> > We have discovered an issue with the pg log hard limit
> >> > patches(https://github.com/ceph/ceph/pull/23211,
> >> > https://github.com/ceph/ceph/pull/24308), where a partial upgrade
> >> > during backfill can cause the osds on the previous version to fail
> >> > with "assert(trim_to <= info.last_complete)". Full description of the
> >> > bug is here: http://tracker.ceph.com/issues/36686.
> >> >
> >> > These changes are in 13.2.2 and 12.2.9, and a workaround for users is
> >> > to upgrade and restart all OSDs to a version with the pg hard limit,
> >> > or only upgrade when all PGs are active+clean.
> >> >
> >> > Until we add capability to have the pg log hard limit work smoothly in
> >> > the upgrade case, we will be reverting these changes,
> >> > https://github.com/ceph/ceph/pull/24903, and releasing 12.2.10 as
> >> > early as possible.
> >> >
> >> > We are also reverting https://github.com/ceph/ceph/pull/24902, which
> >> > is a low impact bug, but might cause issues in the field.
> >> >
> >> > Sorry for any inconvenience caused due to this.
> >> >
> >> > Thanks,
> >> > Neha
> >> >
> >>
> >> --
> >> Nathan Cutler
> >> Software Engineer Distributed Storage
> >> SUSE LINUX, s.r.o.
> >> Tel.: +420 284 084 037
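
For the workaround quoted above (only upgrade when all PGs are
active+clean), a pre-upgrade check could look roughly like the sketch
below. This is only an illustration: the function name and the input
(a mapping of PG state name to PG count, which you would have to build
yourself from your cluster's status output) are assumptions and not
part of Ceph. It is deliberately conservative and treats any state
other than exactly "active+clean" as not ready.

```python
def all_pgs_active_clean(pg_state_counts):
    """Return True only if every PG is in exactly the 'active+clean' state.

    pg_state_counts is a hypothetical mapping of PG state name -> number
    of PGs in that state, e.g. {"active+clean": 128}. How it is obtained
    from a cluster is left out on purpose.
    """
    total = sum(pg_state_counts.values())
    clean = pg_state_counts.get("active+clean", 0)
    # An empty mapping means we know nothing, so do not report "safe".
    return total > 0 and clean == total


if __name__ == "__main__":
    # Made-up sample data: one cluster settled, one still backfilling.
    settled = {"active+clean": 128}
    backfilling = {"active+clean": 120, "active+remapped+backfilling": 8}
    print(all_pgs_active_clean(settled))
    print(all_pgs_active_clean(backfilling))
```

With a check like this, an upgrade script would simply refuse to
restart any OSD while the function returns False.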
