Re: Proposal: changes to "default application functionality" release criteria

Kamil Paral <kparal@xxxxxxxxxx> · Wed, 4 May 2022 14:46:16 +0200

On Wed, May 4, 2022 at 12:13 AM Matthew Miller <mattdm@xxxxxxxxxxxxxxxxx> wrote:

Time to talk about

https://www.fedoraproject.org/wiki/Fedora_36_Final_Release_Criteria#Default_application_functionality

again!

Lots of desktop-app-related blockers this time around, and last time too. I

think we're hitting a symptom-of-our-success problem here: increasing

popularity and reviews noting how polished everything is makes us very much

want to build on that. So I understand why this is here, including the

expanded "all installed applications" Workstation criteria.

As Adam already noted, it is actually cut down. We used to have higher standards in this area (and we lowered them because we couldn't keep them, especially when KDE ships a bazillion preinstalled apps).

So, ideas for discussion:

1. I know we have had GNOME Test days, but this

   stuff didn't come up. Presumably, it would have if someone had happened

   to look at GNOME Photos. Can we formally go through

   https://www.fedoraproject.org/wiki/QA:Testcase_desktop_app_basic and

   https://www.fedoraproject.org/wiki/QA:Testcase_desktop_app_basic_others

   much, much earlier (probably when the GNOME pre-release is available),

   either as part of a test day or some other formal thing? Or is there

   something else going on here that I'm not looking closely enough to see?

We could have a GNOME Apps Test Day, or perhaps a GNOME Low Profile Apps Test Day. Separating that from GNOME DE basics + settings etc would perhaps motivate people to focus more on those apps and spend more time with them.

2. "Basic functionality" seems scoped too broadly currently. I propose, for

   the release criteria, we change this to: A) "the app doesn't crash on

   launch", and B) "the app's behavior does not seem immediately

   embarrassing with a few minutes of playing around with it".

If you only want to block on app launch and close, let's be honest about it and call it that. The basic functionality requirement can stay for those high-profile apps listed explicitly in the criterion, and the other apps would only be required to launch and close without crashing.

However, that's a *massive* step down in quality. Do we really want that?

Your B) is too vague for me. If we ship with our current photos/contacts/etc bugs, I'll feel embarrassed.

If needed, we can try to define "basic functionality" more clearly. For example:
a) We can say that the tested feature must be in line with the primary goal of the application. For Nautilus, that would probably be managing local files, but not connecting to remote filesystems, or a functional bookmarking system. For Photos, that would be local albums organization (and perhaps viewing remote ones), but not exporting photo thumbnails. For Cheese, that would be recording from your camera, but not applying effects.
b) We can say that functionality which is only available through app menus (as opposed to user facing buttons) is not basic.
c) We can say that bugs which only occur if you modify default app settings do not qualify.
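
To make the idea concrete, clarifications a) to c) could be written down as a simple checklist. This is only a sketch under assumptions: the `BugReport` fields and the `is_basic_functionality_bug` helper are hypothetical names invented for this illustration, not part of any existing Fedora tooling or criterion text.

```python
from dataclasses import dataclass


@dataclass
class BugReport:
    """Hypothetical description of a bug; every field is an assumption."""
    affects_primary_goal: bool        # a) feature is in line with the app's primary goal
    only_reachable_via_menu: bool     # b) feature is only available through app menus
    needs_non_default_settings: bool  # c) bug only occurs with modified default settings


def is_basic_functionality_bug(bug: BugReport) -> bool:
    """Return True if the bug would still count after clarifications a) to c)."""
    if not bug.affects_primary_goal:
        return False
    if bug.only_reachable_via_menu:
        return False
    if bug.needs_non_default_settings:
        return False
    return True


# A photo organizer that cannot organize local albums (primary goal, default setup).
photos_bug = BugReport(True, False, False)
# A Cheese bug that only affects applying effects (not the primary goal).
cheese_effects_bug = BugReport(False, False, False)

print(is_basic_functionality_bug(photos_bug))
print(is_basic_functionality_bug(cheese_effects_bug))
```

Even in this reduced form, the Photos case still qualifies, which is the point made further down: narrowing the definition does not make such bugs go away.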

We can make many clarifications like these, and perhaps they would sometimes help us settle our arguments. At the same time, they can also burn us. And we'd have to decide whether we want to apply the same standards to both the high-profile apps listed in the criterion and the low-profile apps (all the rest), or whether we want different standards.

But, if we keep at least some reduced "basic functionality" requirement for those low-profile apps, I don't think that would help the current situation. A photo organizer which can't organize photos doesn't meet that criterion, whichever way you look at it. A contacts app which duplicates contacts on edit, crashes when you add a new contact quickly, and fails to delete contacts more often than not... doesn't block the release already (however weird that sounds), so again no change.

So as I see it, we can update the basic functionality description, but as long as it affects low-profile apps, these situations will keep happening. Or we can remove it completely (at least for low-profile apps), but then we'd see a massive drop in quality.

3. Problems found which are not regressions should not be blockers. We're

   just hurting ourselves when we make this our forcing function to get

   something fixed.

Such a broad rule is a really bad idea. You're probably thinking "this bug was already there, and very few people complained, so why block our next release on it?". Yes, if we include a safeguard "and very few people complained", the proposal starts to sound more reasonable. But imagine that there was a massive disaster in our last release - e.g. a Nautilus data-corruption bug slipped through our fingers, or Anaconda ate hard drives in certain cases, stuff like that. We only found out after the release, when we received a flood of angry bug reports of people leaving Fedora for good, and by your definition... we are **prevented** from blocking our next release on that. That's surely not a good idea?

I think I understand where you're going with this, but it can't be that broad.

   I propose that the teams responsible for blocking desktop deliverables

   keep their own prioritized lists of this kind of problem that the team

   agrees should be fixed for a good user experience. Not just add to the

   general queues of bugs or tickets, but specific lists of "application

   experience issues".

   These lists, of course, could include problems which also don't qualify

   for point #2 but which seem important. Like the Photos app issues.

   (This could also extend to "desktop experience issues" rather than just

   "application". Or for that matter there could be a similar mechanism for

   non-desktop blocking deliverables.)

   I could be convinced either way on having these in the teams' issue

   trackers in pagure or wherever _or_ having it as more targets in the

   blockerbugs app. I tend towards the latter: I think it might help with

   the problem where blockers feel like the only obvious way to get bugs tracked

   and fixed. But either way, there should be lists!

Well, if there are lists which those teams actually follow and maintain, that would of course be a good thing. A Bugzilla tracker would be the easiest implementation. I'm a bit worried that those teams will get flooded with an avalanche of reports, but that's not really my problem to solve. I'm more than happy to tag e.g. Workstation bugs against the Workstation Bugzilla tracker, to make them aware.

4. Desktop application problems discovered during at the last minute should

   not be blockers. 

We already have a rule for last minute blockers, and it applies to everything, not just desktop:
https://www.fedoraproject.org/wiki/QA:SOP_blocker_bug_process#Exceptional_cases
I don't understand this sentence.

If the problem is really going to impact a lot of

   people, it should have been discovered in the beta. 

But that's the point. The Photos bug is unlikely to affect many people, because its usage is probably very low. And that's why it wasn't discovered before. Because testers don't use it regularly, and if you spend one minute with it, it might seem "OK" to you, depending on which buttons you click. I wrote my thoughts about this problem in detail here:
https://pagure.io/fedora-workstation/issue/304#comment-795150

At the same time, I disagree with the intention of this change. Only Beta-related things get proper testing around Beta, because it's *Beta*, and therefore we focus on *Beta* stuff. Even then, we often miss Beta-blocking bugs and discover them before Final. We can't test Final-related stuff (not deeply, and sometimes not at all), because we simply don't have time for it during Beta. Not to mention that Beta-related bugs sometimes preclude actually testing Final-related things. We can only start properly testing Final once Beta is out. Also, let's not forget that GNOME completely changes around Beta with a new major update. I simply don't see how this "should have been discovered in the beta" could happen in the real world.

   (Exception for _new_ regressions, of course.)   
   I don't know how to phrase this in a way that doesn't make Adam sad with

   me, but maybe something like: Desktop application blockers discovered

   during the final freeze are automatically waived 

I'm sad as well :sad panda:. This would effectively say "we don't care about Desktop". We would ship it horribly broken. Again, if Nautilus eats your documents, it's not a blocker, just because it was discovered after Beta. Why such a broad statement? And why do you single out desktop apps in particular? And why all of them instead of some subset, like the low-profile ones?

And now a bit more technical. How do you want to decide whether the bug was present before Beta but discovered after Beta, or whether it appeared after Beta? The nightly composes are only stored on our servers for 2 weeks (!). I already hit this issue this cycle: I could technically have figured out when a certain regression started, but I wasn't able to, because the older composes had already been wiped.
Another note: this "proof delivery" will double the load on QA, because we would not only need to find the bug, we would also need to prove that it wasn't there before a certain compose. For every proposed blocker.
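
The retention problem can be shown with a bit of date arithmetic. This is a minimal sketch, assuming a fixed 14-day retention window (the "2 weeks" mentioned above); the function names and the dates are made up for illustration and do not reflect the actual compose infrastructure or schedule.

```python
from datetime import date, timedelta

# Assumption: nightly composes are kept for exactly 14 days.
RETENTION = timedelta(days=14)


def compose_still_available(compose_date: date, today: date) -> bool:
    """Return True if a nightly compose from compose_date has not been wiped yet."""
    return today - compose_date <= RETENTION


def can_check_against_beta(beta_date: date, today: date) -> bool:
    """A bug can only be compared against Beta-era composes if one still exists."""
    return compose_still_available(beta_date, today)


# Illustrative dates only: a Beta-era compose from about five weeks earlier.
today = date(2022, 5, 4)
print(can_check_against_beta(date(2022, 4, 1), today))    # older than the window
print(compose_still_available(date(2022, 4, 25), today))  # inside the window
```

With a window this short, any bug found more than two weeks after Beta cannot be checked against a Beta-era nightly, regardless of how much QA time is available.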

unless the relevant Spin

   or Edition team decides otherwise.

All the power to the working groups. I'd happily let them decide about all bugs related to their Edition, and maintain their own release criteria and everything, because that would be way less work for QA :-) But I don't suppose they'd jump for joy about this.

So unless they really want all this responsibility, perhaps it could work the other way around: accepted blockers could be waived by the decision of the relevant team. It's their product after all. Of course, there should be some systematic approach, so that we don't waste much time discussing blockers which then get waived. And if there are many waivers in a certain area, the related criteria should be adjusted to reflect that. But in general, I think this approach could work fine.

5. Okay, and... bigger: we should aim for more approaches which let us

   decouple as much as possible from the Release. (My grand hope is that we

   can release every deliverable on its own schedule, but I also understand

   the _highly aspirational_ nature of that idea. But...) What if we could

   just easily ship GNOME Photos from GNOME 41 until a fix is found in the

   updated one?

The answer could be Flatpak. And honestly, there's nothing in our criteria preventing the Workstation WG from doing just that right now. But technically we're not ready yet, I guess. If this can be done with RPMs as well, let's go ahead. Ubuntu does that all the time.

After reading all these proposed changes, I have to ask - what is the main motive for proposing them? Is it to release F36 already? Is it to prevent future Fedoras from delaying as much as F36? We used to be OK with release slips. Is desktop importance lower than before? Is it the frustration from finding trivial bugs in trivial apps so close to the final release? Something else?

Because depending on what we want to achieve, perhaps there is a better way. With the current proposed changes, I believe that the end result would be a less-delayed Fedora with more broken desktop apps. The reason for introducing the basic functionality criterion in the past was, iirc, the bad PR we got in reviews when desktop apps often broke soon after the reviewer tried to use them. It seems we'd be heading back in that direction with this proposal.

Instead of shipping broken apps, what if we had a conversation about which is better - shipping broken apps or not shipping those apps at all? And I mean this question honestly. Is it better to ship something we *know* is in a broken state, and hopefully issue an update later, or is it better to yank it from the default install (provided it's not crucial for the desktop)? Or delay the release? I'd start with defining our priorities in this way. Is it unthinkable to have a plan like "apps from group X can only delay the release for at most Y weeks, otherwise we'll not ship them by default"? It would also make us re-evaluate whether we really need to ship everything we currently have, including unmaintained apps without any developers, or half-baked apps with just a slight community maintenance, etc. I don't mean to be derogatory to some of those GNOME apps. But if those apps are problematic, is the right approach to lower the quality bar for all apps included, or should we rather make some adjustments just for the problematic set?
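
As a sketch of what such a "group X, at most Y weeks" plan might look like when written down: the group names, the limits, and the `decide_app_fate` helper below are all hypothetical, chosen only to illustrate the shape of the rule, not proposed values.

```python
def decide_app_fate(group: str, weeks_delayed: int, max_delay_weeks: dict[str, int]) -> str:
    """Apply the 'group X may delay the release for at most Y weeks' idea.

    Groups without an entry in max_delay_weeks have no limit and keep blocking.
    """
    limit = max_delay_weeks.get(group)
    if limit is None or weeks_delayed < limit:
        return "keep blocking"
    return "drop from default install"


# Hypothetical limits: low-profile apps may hold the release for at most 2 weeks.
limits = {"low-profile": 2}

print(decide_app_fate("low-profile", 1, limits))
print(decide_app_fate("low-profile", 2, limits))
print(decide_app_fate("high-profile", 5, limits))
```

The interesting part is not the code but the inputs: someone would have to agree on which apps belong to which group and how many weeks each group is worth.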

I understand that the Workstation team wants to have some basic functionality set present on the desktop, and that this is a painful topic for them. And perhaps shipping those apps broken will be decided to be the best option. But it seems we're discussing something completely different here instead.

(I'd also be glad if we could put the toolkit and desktop environment wars behind us, and simply ship the best-in-class app (let's say the best photo organizer available) with our desktop, whatever the toolkit. It would have an existing userbase, more maintenance and QA, and GNOME folks could focus on great integration (looking close to native, online accounts integration, etc.) instead of writing everything from scratch. That would also avoid some of the issues we see. But I don't believe that will happen any time soon.)

Kamil
QA
