Nathan,

I don't think we want to revert it for 13.2.2. The pg log hard limit
feature currently doesn't work well in a partial-upgrade
recovery/backfill scenario. So even if we do revert it in 13.2.3, we
could still end up in a split scenario, where some OSDs in the field
running 13.2.2 (with the hard limit code) and others running 13.2.3
(without the code) may encounter http://tracker.ceph.com/issues/36686.

Therefore, users who have successfully upgraded to 13.2.2 shouldn't be
at any risk. For users trying to upgrade to a version >= 13.2.2, I am
going to make a note of this issue and add the suggested workaround to
the Pending Release Notes for mimic.

Does that make sense?

Thanks,
Neha

On Mon, Nov 5, 2018 at 2:43 PM, Yuri Weinstein <yweinste@xxxxxxxxxx> wrote:
> Acknowledged
>
> On Mon, Nov 5, 2018 at 2:35 PM Nathan Cutler <ncutler@xxxxxxx> wrote:
>>
>> Thanks, Neha. The luminous revert was just merged and we'll cut 12.2.10
>> to push it out to users.
>>
>> Regarding Mimic, will there be a revert there as well? Since the pg hard
>> limit patches are present in 13.2.2, it sounds like we'll need to revert
>> them before we release 13.2.3?
>>
>> (Note that Yuri was planning to start QE for 13.2.3 - Yuri, please hold
>> off on that for now?)
>>
>> Nathan
>>
>> On 11/5/18 6:50 PM, Neha Ojha wrote:
>> > Hi All,
>> >
>> > We have discovered an issue with the pg log hard limit
>> > patches(https://github.com/ceph/ceph/pull/23211,
>> > https://github.com/ceph/ceph/pull/24308), where a partial upgrade
>> > during backfill, can cause the osds on the previous version, to fail
>> > with "assert(trim_to <= info.last_complete)". Full description of the
>> > bug is here: http://tracker.ceph.com/issues/36686.
>> >
>> > These changes are in 13.2.2 and 12.2.9, and a workaround for users is
>> > to upgrade and restart all OSDs to a version with the pg hard limit,
>> > or only upgrade when all PGs are active+clean.
>> >
>> > Until we add capability to have the pg log hard limit work smoothly in
>> > the upgrade case, we will be reverting these changes,
>> > https://github.com/ceph/ceph/pull/24903, and releasing 12.2.10 as
>> > early as possible.
>> >
>> > We are also reverting https://github.com/ceph/ceph/pull/24902, which
>> > is a low impact bug, but might cause issues in the field.
>> >
>> > Sorry for any inconvenience caused due to this.
>> >
>> > Thanks,
>> > Neha
>> >
>>
>> --
>> Nathan Cutler
>> Software Engineer Distributed Storage
>> SUSE LINUX, s.r.o.
>> Tel.: +420 284 084 037
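
One of the workarounds mentioned in the thread is to upgrade only when all PGs are active+clean. A minimal sketch of such a pre-upgrade check follows. It is not part of the original messages. It assumes that `ceph status --format json` reports a `pgmap` object with `num_pgs` and a `pgs_by_state` list of `state_name`/`count` entries, and the helper names `all_pgs_active_clean` and `fetch_cluster_status` are hypothetical.

```python
import json
import subprocess


def all_pgs_active_clean(status: dict) -> bool:
    """Return True only if every PG is reported as exactly active+clean.

    Assumes the layout status["pgmap"]["pgs_by_state"] with
    "state_name" and "count" keys, plus status["pgmap"]["num_pgs"].
    """
    pgmap = status.get("pgmap", {})
    num_pgs = pgmap.get("num_pgs", 0)
    clean = sum(
        entry.get("count", 0)
        for entry in pgmap.get("pgs_by_state", [])
        if entry.get("state_name") == "active+clean"
    )
    # An empty or missing pgmap is treated as "not safe" rather than "clean".
    return num_pgs > 0 and clean == num_pgs


def fetch_cluster_status() -> dict:
    """Hypothetical helper: run the ceph CLI on an admin node and parse its JSON."""
    result = subprocess.run(
        ["ceph", "status", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

On a node with the ceph CLI and an admin keyring, `all_pgs_active_clean(fetch_cluster_status())` could gate the start of an OSD upgrade. The check is deliberately strict: combined states such as active+clean+scrubbing are counted as not clean, so it may ask you to wait longer than strictly necessary.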