Re: 14.2.5 QE Nautilus validation status

On Fri, Dec 6, 2019 at 6:58 AM Abhishek Lekshmanan <abhishek@xxxxxxxx> wrote:
>
> Sage Weil <sweil@xxxxxxxxxx> writes:
>
> > On Fri, 6 Dec 2019, Alfredo Deza wrote:
> >> I would like to raise once again the discussion of improving our
> >> release process so that we can minimize these things happening at the
> >> very last minute.
> >>
> >> This isn't the first time we've hit issues that had to be reverted
> >> (luckily packages didn't get to download.ceph.com), and since we
> >> insist on owning the package and repository creation, there
> >> is no way we can retire a "bad" release that is available publicly.
> >
> > Doesn't the fact that we own the distribution mean that we *can* yank a
> > bad release?  If we were relying on downstreams to do this for us we'd
> > have no control at all.  Currently the only reason we can't is because
> > we're chafing against the tools.
> >
> >> A few of the items that I believe could improve are:
> >>
> >> * Start having stricter development and freeze periods
> >> * Add a quiet period (or waiting time) of a week for a release to be public
> >> * Determine what the SHA to release is, so that testing can be done
> >> against that SHA
> >> * Avoid depending on back and forth emails asking leads if they are OK
> >> with a release or if they need another PR to be merged - let's define
> >> criteria to meet this.
> >
> > I think the main missing piece here is deploying the release candidate
> > into production before actually releasing.  Most of these last minute
> > issues were caught when we deployed on the lab cluster and my home
> > cluster.   I think this is similar to your 'quiet' period: once qa passes,
> > we should deploy internally and make sure there aren't things the qa suite
> > missed.
> >
> > The other problem was discovered in parallel by David and was totally
> > unrelated to the release process/freeze/testing, but was deemed to be a
> > release blocker.  I don't think any process can avoid that.
> >
> > This release is a bit different, though, because there is a data
> > corruption bug we really want to get out.  Perhaps what we should have
> > done here is a minimal release that only has that one bug fix so that our
> > regression risk is basically 0 and we can expedite.  Instead we grabbed
> > everything that had already been tested and merged and went with that.

I agree that this should have been a minimal release with just the
bluestore bug fix. That's what we were trying to push for when we
wanted to expedite 14.2.5 a couple of weeks ago. However, the fact that
14.2.5 already had a bunch of fixes (14.2.4 was released in early
September) led to more follow-on bug fixes that we wanted to deliver
in 14.2.5. One could argue that those "follow-on bug fixes" could land
in 14.2.6, if we did a timely 14.2.5 release. However, since this time
we were able to catch a few annoying bugs ourselves that would have
been reported by users anyway at some point, I think we are better off
delaying the release by a few more days to avoid a bad impression and
having to do an immediate 14.2.6 to deal with those.

In no way does the above mean that we should not have stricter
deadlines for freezes.

We've been using email, GitHub and IRC to inform everyone when we
caught such blocker bugs. Is there a better medium (or common forum)
that we can use to get everyone on the same page when such things
happen?

Neha


>
> We currently merge to the release branch regularly after QE validation.
> We could instead merge to {release}-next branches and only merge to
> {release} after final QE validation and approval. This would make expedited
> releases easier, since we could prepare a separate branch containing only
> the fixes and merge that to the release branch, but the workflow tools
> would need to be adapted.
> >
> > sage
> >
> >
> >>
> >>
> >> On Thu, Dec 5, 2019 at 1:09 PM Neha Ojha <nojha@xxxxxxxxxx> wrote:
> >> >
> >> > https://github.com/ceph/ceph/pull/32018 has merged, we should be ready
> >> > to build 14.2.5 now.
> >> >
> >> > On Wed, Dec 4, 2019 at 5:02 PM Neha Ojha <nojha@xxxxxxxxxx> wrote:
> >> > >
> >> > > David Zafman has discovered a buggy patch in nautilus, which we want
> >> > > to revert before releasing 14.2.5. More details in
> >> > > https://github.com/ceph/ceph/pull/31970#issuecomment-561913632. The
> >> > > revert PR https://github.com/ceph/ceph/pull/32018 is being tested now.
> >> > > We'll need to rebuild 14.2.5 once the revert merges.
> >> > >
> >> > > Sorry about the inconvenience.
> >> > >
> >> > > Neha
> >> > >
> >> > > On Wed, Dec 4, 2019 at 11:28 AM Sage Weil <sweil@xxxxxxxxxx> wrote:
> >> > > >
> >> > > > On Wed, 4 Dec 2019, Abhishek Lekshmanan wrote:
> >> > > > > Yuri Weinstein <yweinste@xxxxxxxxxx> writes:
> >> > > > >
> >> > > > > > David, assuming Sage is OK with `ceph-deploy` and overall tests
> >> > > > > > results, this is ready for publishing.
> >> > > > >
> >> > > > > Sage, is this ready to build and start publishing packages for?
> >> > > >
> >> > > > Yeah, I think we're good to go!
> >> > > >
> >> > > > Thanks everyone-
> >> > > > sage
> >> > > >
> >> > > > > > Abhishek, Nathan FYI
> >> > > > > >
> >> > > > > > On Wed, Nov 27, 2019 at 12:38 PM Neha Ojha <nojha@xxxxxxxxxx> wrote:
> >> > > > > >>
> >> > > > > >>
> >> > > > > >>
> >> > > > > >> On Wed, Nov 27, 2019 at 8:01 AM Yuri Weinstein <yweinste@xxxxxxxxxx> wrote:
> >> > > > > >>>
> >> > > > > >>> Outstanding need approval:
> >> > > > > >>>
> >> > > > > >>> ceph-deploy - Sage
> >> > > > > >>> upgrade/luminous-x (nautilus) - Neha, Josh reviewing
> >> > > > > >>
> >> > > > > >> approved
> >> > > > > >>
> >> > > > > >>>
> >> > > > > >>> upgrade/mimic-x (nautilus) - Neha, Josh reviewing
> >> > > > > >>
> >> > > > > >> approved, failure tracked in https://tracker.ceph.com/issues/43048
> >> > > > > >>
> >> > > > > >>>
> >> > > > > >>> On Fri, Nov 22, 2019 at 7:22 AM Yuri Weinstein <yweinste@xxxxxxxxxx> wrote:
> >> > > > > >>> >
> >> > > > > >>> > (This is an early update, some tests are still running, as we are
> >> > > > > >>> > trying to release this point next week before the US holidays, and
> >> > > > > >>> > have more time to review results)
> >> > > > > >>> >
> >> > > > > >>> > Details of this release summarized here:
> >> > > > > >>> > https://tracker.ceph.com/issues/42839#note-3
> >> > > > > >>> >
> >> > > > > >>> > rados - approved by Neha
> >> > > > > >>> > rgw - approved by Casey
> >> > > > > >>> > rbd - need approval Jason
> >> > > > > >>> > krbd - need approval Jason, Ilya
> >> > > > > >>> > fs - need approval Patrick, Ramana
> >> > > > > >>> > kcephfs - need approval Patrick, Ramana
> >> > > > > >>> > multimds - need approval Patrick, Ramana
> >> > > > > >>> > ceph-deploy - FAILED Sage, Alfredo ?
> >> > > > > >>> > ceph-disk - N/A
> >> > > > > >>> > upgrade/client-upgrade-hammer (nautilus) - N/A
> >> > > > > >>> > upgrade/client-upgrade-jewel (nautilus) - PASSED
> >> > > > > >>> > upgrade/client-upgrade-mimic (nautilus) - FAILED
> >> > > > > >>> > upgrade/luminous-p2p - in progress
> >> > > > > >>> > powercycle - in progress
> >> > > > > >>> > ceph-ansible - Brad is fixing
> >> > > > > >>> > upgrade/luminous-x (nautilus) - in progress
> >> > > > > >>> > upgrade/mimic-x (nautilus) - in progress
> >> > > > > >>> > ceph-volume - Jan fixing
> >> > > > > >>> > (please speak up if something is missing)
> >> > > > > >>> >
> >> > > > > >>> > Thx
> >> > > > > >>> > YuriW
> >> > > > > >>>
> >> > > > > > _______________________________________________
> >> > > > > > Dev mailing list -- dev@xxxxxxx
> >> > > > > > To unsubscribe send an email to dev-leave@xxxxxxx
> >> > > > >
> >> > > > > --
> >> > > > > Abhishek Lekshmanan
> >> > > > > SUSE Software Solutions Germany GmbH
> >> > > > > GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)
> >> >
> >>
> >>
>
> --
> Abhishek Lekshmanan
> SUSE Software Solutions Germany GmbH
> GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG Nürnberg)