Re: [PROPOSAL] No FEs X hours before Go/No Meeting

On Wed, 2020-10-28 at 12:29 -0400, Ben Cotton wrote:
> In yesterday's F33 emergency brake meeting[1], one of the issues that
> came up is the short window for testing all of the different IoT
> hardware. One way to reduce that effort (by extending the window) is
> reducing the change set in the days before an RC. I don't know if it
> would have helped in this particular case, as the kernel update that
> caused some of the problems had security fixes, so we may have left it
> in anyway. But the general idea is if we don't allow FEs all the way
> up to the RC compose, we reduce the number of things that could break.
> 
> I don't know if I personally endorse this proposal or not, but in the
> interests of open discussion, I am proposing a change to the freeze
> exception process:
> 
> > Updates with an accepted Freeze Exception will not be included less than X hours before the scheduled start of the Go/No-Go Meeting.
> 
> I left the time unspecified for now, because we should figure out if
> we like the general concept before we decide on the specific deadline.
> I was thinking something like 72 hours (3 days), which would put the
> deadline at 1700 UTC Monday.
> 
> For simplicity's sake, I'm only sending this to the QA list now. If we
> have a general consensus on it being a good thing, we should
> distribute it more broadly before we adopt it.
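
For reference, the cutoff arithmetic in the quoted proposal can be
sketched as below. This is only an illustration, not anything that
exists in Bodhi or the release tooling: the names `fe_cutoff` and
`fe_allowed` are made up, and the meeting time (Thursday 2020-10-29 at
1700 UTC) is an assumed example date picked only because it falls on a
Thursday. With X = 72 the cutoff lands on a Monday at 1700 UTC, which
matches the figure in the proposal.

```python
from datetime import datetime, timedelta, timezone


def fe_cutoff(go_no_go_start: datetime, hours_before: int) -> datetime:
    """Latest time an accepted FE update could still be included.

    Hypothetical helper: subtracts the X-hour window from the
    scheduled start of the Go/No-Go meeting.
    """
    return go_no_go_start - timedelta(hours=hours_before)


def fe_allowed(push_time: datetime, go_no_go_start: datetime,
               hours_before: int) -> bool:
    """True if the push is not less than X hours before the meeting."""
    return push_time <= fe_cutoff(go_no_go_start, hours_before)


# Assumed example: a Go/No-Go meeting on a Thursday at 1700 UTC.
meeting = datetime(2020, 10, 29, 17, 0, tzinfo=timezone.utc)
cutoff = fe_cutoff(meeting, 72)

print(cutoff.isoformat())   # 2020-10-26T17:00:00+00:00
print(cutoff.weekday())     # 0, i.e. Monday
```

A push on the Tuesday of that week would be rejected under this rule,
while one on the preceding Friday would still be allowed.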

I don't think I agree with this, but I *do* think I handled this badly
in the specific case we're referring to.

So, what happened here is:

* We had an accepted FE bug for a Bluetooth CVE (security) issue:
  https://bugzilla.redhat.com/show_bug.cgi?id=1888439

* A kernel update marked as fixing that bug was submitted on 2020-10-15
and submitted for stable later the same day. I submitted a stable push
request including that update the next day, and it was pushed.
2020-10-16 was exactly a week before the Final Go/No-Go meeting, so this
update was in stable for a week before Go/No-Go:
  https://bodhi.fedoraproject.org/updates/FEDORA-2020-ce117eff51
  https://pagure.io/releng/issue/9725#comment-696530

* The kernel update first appeared in a compose validation event on
2020-10-19, only four days before Go/No-Go, because there was no
nightly validation event between 2020-10-16 and RC 1.2, which was built
on 2020-10-19.

* The update did not just fix the Bluetooth CVE: it was actually an
*entire kernel patch version update*, from 5.8.14 to 5.8.15.

There is actually a policy that updates to fix blocker and FE issues
should include the minimal change necessary to fix the bug. We (mainly
I) have gotten progressively laxer about enforcing that in the last few
years, to the point where I rarely actually bother with it at all
except in super egregious cases. This is because we've kinda got a
better track record of updates not breaking stuff, and we have much
better test coverage than we used to.

However, that was obviously the problem in this case. We (mainly I)
pulled in what was actually a fairly big change (new kernel release)
quite late in the process when we could have insisted on a smaller
change (just the CVE fix patch on top of 5.8.14), didn't include it in
a candidate compose until even later in the process, and didn't flag it
up as a significant change that needed testing. That's on me and I
apologize for it. (For the record, I didn't consciously consider this
and decide it wasn't a big deal; I actually just kinda blew through the
stable push request quickly, I don't recall why, and didn't *notice* we
were pulling in an entire kernel release bump. Obviously, part of that
is the above note that I've been getting generally laxer about checking
this; a few years back I used to rigorously check how much change was
in every single proposed blocker/FE update, nowadays I kinda...don't.)

So I don't think we really need a new rule here, we (mainly I) just
need to go back to being more careful about the policy we already have.
What should have happened here is I should have noticed we had an
entire kernel version update proposed as the fix for an FE and talked
to the kernel team about it. We could then either have decided to pull
in the update, but make sure it was properly tested, including flagging
it up to the ARM folks and making sure we had time to test it across
the important ARM platforms; or we could have decided to not pull in
5.8.15 and instead do 5.8.14 with a patch for the CVE. All of that is
what *should have happened under the current rules*, and I just whiffed
it. Again I'm sorry for that.
-- 
Adam Williamson
Fedora QA Community Monkey
IRC: adamw | Twitter: AdamW_Fedora | XMPP: adamw AT happyassassin . net
http://www.happyassassin.net