Re: Proposal: changes to "default application functionality" release criteria

Adam Williamson <adamwill@xxxxxxxxxxxxxxxxx> · Tue, 03 May 2022 17:14:53 -0700

On Tue, 2022-05-03 at 18:13 -0400, Matthew Miller wrote:
> Time to talk about
> https://www.fedoraproject.org/wiki/Fedora_36_Final_Release_Criteria#Default_application_functionality
> again!
> 
> Lots of desktop-app-related blockers this time around, and last time too. I
> think we're hitting a symptom-of-our-success problem here: increasing
> popularity and reviews noting how polished everything is makes us very much
> want to build on that. So I understand why this is here, including the
> expanded "all installed applications" Workstation criteria.

To be clear, it's not exactly that the Workstation x86_64 requirement
is expanded, but that the other requirements are reduced. Up until a
couple of years ago, the requirement was "all installed applications"
for all release-blocking desktops, on all release-blocking arches. We
narrowed it down to being "all installed applications" for Workstation
on x86_64, and just the specified list of apps for other cases (KDE on
any arch, Workstation on aarch64).

> But I think we might be using the wrong tool for some of this polish, and I
> think we also need to give ourselves some escape hatches.
> 
> By way of concrete example, the Photos application is meant to be a photo
> organizer, so "album picker duplicates fields, preventing photo
> organization" https://bugzilla.redhat.com/show_bug.cgi?id=2081291 is easy to
> classify as failing the basic functionality test currently. But, it's really
> painful for that to be blocking the release.
> 
> 
> So, ideas for discussion:
> 
> 
> 1. I know we have had GNOME Test days, but this
>    stuff didn't come up. Presumably, it would have if someone had happened
>    to look at GNOME Photos. Can we formally go through
>    https://www.fedoraproject.org/wiki/QA:Testcase_desktop_app_basic and
>    https://www.fedoraproject.org/wiki/QA:Testcase_desktop_app_basic_others
>    much, much earlier (probably when the GNOME pre-release is available),
>    either as part of a test day or some other formal thing? Or is there
>    something else going on here that I'm not looking closely enough to see?

Well, kinda, yes. The thing going on is that we *did* go through that
test at several earlier points:

https://openqa.fedoraproject.org/testcase_stats/36/Desktop/QA_Testcase_desktop_app_basic_others_Release_blocking_desktops___lt_b_gt_x86___x86_64_lt__b_gt_.html

but these bugs weren't discovered at that time. This is likely because
of your idea 2: "basic functionality" is a bit up for debate. You can
take an extremely minimalist approach to this (run the app, click a
couple of buttons, say it's OK if nothing explodes and no babies are
eaten) or a slightly more maximalist one (run the app, and actually try
and do something useful with it). In this case, before Final, when we
ran this test we mostly did the minimalist thing. At Final RC stage, we
went a bit more maximalist.

> 
>    (Same for KDE and any other theoretical release-blocking desktops!)
> 
>    
> 2. "Basic functionality" seems scoped too broadly currently. I propose, for
>    the release criteria, we change this to: A) "the app doesn't crash on
>    launch", and B) "the app's behavior does not seem immediately
>    embarrassing with a few minutes of playing around with it".
>    
>    As a barometer for "embarrassing", you can imagine me trying to explain
>    the issue to a tech reporter, and weigh how awkward I will feel saying
>    "this is fine" vs. how I will feel explaining that we delayed the whole
>    release for that same issue.

I agree with the sentiment but I'm not sure about the phrasing. It's
extremely subjective, and I don't think subjective criteria work very
well. It also wouldn't necessarily "solve the problem": the bugs we
discovered this time really are pretty embarrassing, honestly. Imagine
doing a keynote showing off the sleek default apps included in GNOME,
running them, and trying to do...well...actually anything at all useful
with them. It wouldn't go very well.
> 
> 3. Problems found which are not regressions should not be blockers. We're
>    just hurting ourselves when we make this our forcing function to get
>    something fixed.

This is one of those things that sounds great until it doesn't. I can't
quite recall any specifics, but I definitely think there have been
cases recently where we've had strong support for a bug that is not a
regression to be a blocker. Sometimes a bug is just really bad but we
didn't see it before. You can make a wonk-y argument that there's no
point making it a blocker, because blocking just means the previous
release stays the "current" one for longer, and it has the same bug.
In practice, though, it's hard to hold that line when there's now a
bug report everyone can see that says how badly this thing is broken.

It also, again, wouldn't have solved this problem, because most of
these bugs *are* regressions, IIRC. The Photos bugs weren't in F35;
Photos worked better there.

>    I propose that the teams responsible for blocking desktop deliverables
>    keep their own prioritized lists of this kind of problem that the team
>    agrees should be fixed for a good user experience. Not just add to the
>    general queues of bugs or tickets, but specific lists of "application
>    experience issues".

I am just gonna leave this here for context, and [snip] to:
> 
> 4. Desktop application problems discovered at the last minute should
>    not be blockers. If the problem is really going to impact a lot of
>    people, it should have been discovered in the beta. (Exception for _new_
>    regressions, of course.) By the time we're in final freeze, this ends up
>    being hero work for everyone.
>    
>    I don't know how to phrase this in a way that doesn't make Adam sad with
>    me, but maybe something like: Desktop application blockers discovered
>    during the final freeze are automatically waived unless the relevant Spin
>    or Edition team decides otherwise.

You're right that this makes me sad. I don't think it's a good
approach. I think it's an attempt to solve a problem that I would maybe
look at differently, and which we're currently discussing in a ticket:

https://pagure.io/fedora-workstation/issue/304

For me, the big question your mail never quite arrived at is, *why* did
these bugs show up in Fedora 36 Final RCs at all? They really should
not have done. They are bugs in applications that are, supposedly, core
parts of upstream GNOME. They appear in 42.0 releases of those
applications - i.e. in releases of those applications that are,
according to upstream's versioning scheme, stable releases for public
consumption.

Stable releases of core components of a major desktop should never
contain bugs like "deleting contacts sometimes doesn't work" or "you
can't add photos to an album in the Photos application because the
dialog where you're supposed to do it is completely broken and the list
entries multiply like rabbits who've been dosed up on viagra".
Distribution validation testing is not *for* finding bugs like this.

The reason we're all instinctively feeling that something is Not Right
here is that something *is* Not Right, but the big thing that's Not
Right is the upstream GNOME release process. It's not (IMHO) any part
of the Fedora process - there are things we could tighten up there, but
I see those as subsidiary problems.

It's Not Right that GNOME can ship a 42.0 release containing entirely
broken applications. That should not be happening. It's not something
we should have to design our Fedora distribution validation testing
process to fix.

The criterion and the test case were written with the unspoken, but
IMHO reasonable, assumption that in general we could trust desktops to
provide us with more-or-less-working software. They were never written
with the intent of finding this kind of bug. The scenario I was
envisioning when we wrote them was "oh, we accidentally packaged a
broken development version of app X" or "we're still including random
third-party app Y in the Workstation edition but it's not been
maintained for years and doesn't work any more, let's throw it out". I
was never envisaging having to deal with "GNOME shipped us completely
broken applications in a stable release". I don't think our goal should
be to design a release validation process that deals with that, because
*that shouldn't happen*.

> 5. Okay, and... bigger: we should aim for more approaches which let us
>    decouple as much as possible from the Release. (My grand hope is that we
>    can release every deliverable on its own schedule, but I also understand
>    the _highly aspirational_ nature of that idea. But...) What if we could
>    just easily ship GNOME Photos from GNOME 41 until a fix is found in the
>    updated one?

I mean, we maybe could. I dunno if we tried that yet. There's not
necessarily anything in the current rules/policies that precludes this,
AFAIR.
-- 
Adam Williamson
Fedora QA
IRC: adamw | Twitter: adamw_ha
https://www.happyassassin.net
