Re: Fw: Re: OUTAGE: Koji system 2019-01-11 -> 2019-01-14

On Sun, Jan 13, 2019 at 5:42 AM Dan Horák <dan@xxxxxxxx> wrote:
>
> On Sun, 13 Jan 2019 11:05:33 +0100
> Miro Hrončok <mhroncok@xxxxxxxxxx> wrote:
>
> > On 12. 01. 19 19:47, Kevin Fenzi wrote:
> > > On 1/12/19 5:59 AM, Miro Hrončok wrote:
> > >> On 12. 01. 19 14:19, Kevin Kofler wrote:
> > >>> Dan Horák wrote:
> > >>>> This is a reminder that this begins this Friday and will
> > >>>> probably be in place for 4 days. If there are increased
> > >>>> downtimes, we will update the data when we know it.
> > >>>
> > >>> 4 whole DAYS of Koji outage are absolutely unacceptable!
> > >>>
> > >>> This just shows how bad an idea it was to add those exotic
> > >>> secondary architectures to the primary Koji and to make them
> > >>> blocking for all builds.
> > >>> We really need to go back to building secondary architectures on
> > >>> secondary
> > >>> Koji instances, so that they do not hold the entire Fedora
> > >>> hostage for days!
> > >>
> > >> While I don't necessarily agree with the tone or wording, Kevin
> > >> has a point here.
> > >
> > > I don't agree with the tone, wording or point. :)
> >
> > Ah, it seems my e-mail was a little pointless without providing more
> > thoughts. Sorry about that.
> >
> > > When there were secondary arch koji's it took a number of people
> > > full time to nurse along koji shadow. Often things would land in
> > > primary, secondary would find a bug later and there would need to
> > > be a lot of coordination and back and forth before things were
> > > fixed in both places.
> > >
> > > Now all those folks can concentrate on the bugs as they happen in
> > > primary koji, fix them faster and everyone wins.
> >
> > I agree with this. That clearly reduces a lot of work that would
> > otherwise be carried by a group of alternate arch fighters. Also, the
> > situation is easier to read etc.
> >
> > No doubt that having everything in one Koji is beneficial.
> >
> > > If there were constant outages you might have more of a point, but
> > > this is the only multi-day one I can think of, it's over a weekend,
> > > noarch packages can keep building and if there's a security or
> > > urgent issue we can just Exclude s390x until it's back up.
> >
> > It's however not just about outages. What bothers me about s390x is:
> >
> >   * From time to time there are situations where the build waits for
> > a s390x builder.
>
> as they wait for other arch builders too
>

This is actually a really bad situation. If it weren't for the fact
that Koji tags all resources in and regenerates the internal repos for
all the arches at once, I'd suggest making it possible for each arch to
independently tag in and merge into the buildroot repos. Given how Koji
works today, though, there are a number of problems with that
suggestion.
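
As a rough illustration of the difference, here is a toy model. It is
not Koji's actual code or API: the arch list, function names, and
status dictionary are all made up for this sketch, and it ignores
noarch packages and everything else that makes the real thing hard.

```python
# Toy model only: NOT Koji's code or API; every name here is hypothetical.
ARCHES = ["x86_64", "aarch64", "ppc64le", "s390x"]


def visible_all_at_once(done):
    """Behaviour as described above: a build lands in the buildroot repos
    only once every arch has finished, and then for all arches together."""
    if all(done.get(arch, False) for arch in ARCHES):
        return set(ARCHES)
    return set()


def visible_per_arch(done):
    """Hypothetical alternative: each arch's repo picks up the build as
    soon as that arch has finished, independently of the others."""
    return {arch for arch in ARCHES if done.get(arch, False)}


# Example: the s390x builders are down, the other arches have finished.
status = {"x86_64": True, "aarch64": True, "ppc64le": True, "s390x": False}
print(sorted(visible_all_at_once(status)))  # nothing is visible anywhere
print(sorted(visible_per_arch(status)))     # the three finished arches
```

In the first model one stalled arch holds back every other arch, which
is the situation being complained about in this thread.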

It's regrettable that it was so much more work to maintain mostly
unusable architectures in shadow Koji instances, because the direct
exposure has mostly had a negative effect on the packager experience:
most packagers can't do anything with the architecture and don't know
how to fix problems on it.

> >   * Upstreams don't have access to a s390x CI service and are often
> > unable to debug s390x problems without (very slow) virtualization.
>
> we are aware of the problem, for all non-x86 arches, but there is no
> simple solution
>

Well, strictly speaking, this is really only a problem for IBM
architectures (ppc64le and s390x) because there's no economical way
for anyone to be able to care for them. For both ARM architectures we
support, there are a number of low-cost (in the impulse buy range,
even!) systems that people can acquire to do local development. For
the upcoming RISC-V port, it's practically guaranteed that we're going
to see similar low-cost hardware, so that people can be exposed to the
platform and develop on it relatively soon after the Fedora RISC-V port
is mainlined.

The "simple" solution is to develop some way for affordable access to
the equipment so that it can be cared for by more people. That's
probably a job for IBM and OpenPOWER, but it's something _somebody_
needs to figure out.

> >   * With ppc64 removal, s390x is the only Big Endian arch in the pool
> > (while ppc64 was easier to get to via CentOS CI).
> >
> > I just think that we are burning a lot of resources without a clear
> > visible benefit. Don't get me wrong, I am no architectures expert and
> > I don't intend to pretend I am. It's just that all the s390x bugs
> > I've seen for packages I happen to help taking care of are from other
> > Fedora maintainers who fight FTBFS for their own packages etc., never
> > from any actual users.
> >
> > s390x just feels to me like "that weird thing that often breaks and
> > nobody really cares about". I realize this view is very narrow and is
> > mostly based on feelings rather than facts, however it's just how it
> > feels.
> >
> > Hence I consider it unfortunate that s390x can block the whole
> > distro, even if it's just for a couple days.
>
> is it really such a big problem, during a weekend?
>

Last I heard from Matthew Miller, less than 30% of Fedora contributors
are employed by Red Hat. The remaining are (more or less) volunteers.

Speaking as one of said volunteers, the weekend is literally my time
to take a crack at Fedora stuff. I try to squeeze some of it in during
the week, but I rarely succeed.

> > I'm not saying all this to start a flame or to blame one poor
> > architecture, I'd genuinely love to know the answers to:
> >
> >   * Where can upstreams get a nice CI as a service for s390x? [1]
>
> nowhere yet, but how is it different from other non-x86 arches?
> Upstream can run their CI on our infrastructure, a number of
> them already do.
>
> BTW is there any open-source "CI as a service" software that could be
> just deployed (on any arch)?
>

* Buildbot (Python 3)
* Jenkins (Java)
* Vespene (Python 3)
* GoCD (Java + Ruby)
* Zuul (Python)

Of the five listed above, only the first two are packaged in Fedora.
Buildbot is up-to-date in Fedora, but Jenkins lags behind considerably
and needs love.

I would be rather pleased if we deployed either Buildbot or Vespene,
as these are services that Fedorans can easily hack on, due to their
similarity to other parts of our infrastructure. I also personally
co-maintain Buildbot and try to keep it up to date.

CentOS CI (Jenkins) is unfortunately a terrible service that's hard to
use. It's not first-class anywhere, and integrating it into projects
is painful. The CentOS CI pipeline setup requires Groovy programming
knowledge, because a Jenkinsfile is itself a program. It's also
frustrating to debug, based on my experience with it in the Pagure
project.

If CentOS CI is going to be a successful venture, it's going to need a
lot of work to become more usable, useful, and accessible.

> >   * What are the users of Linux on z and what packages/apps/tools are
> > they interested about? Do they run desktop environments, should I
> > e.g. bother fixing a bug in a desktop GUI application that controls a
> > 3D printer? Or is it just OK to Exclude s390x on such apps? etc.
>
> they probably won't control 3D printers, but the overall goal is to
> make s390x look like any other arch, thus we have for example
> virtio-gpu, so for a user there won't be a difference when running
> their desktop environment. In general, having a heterogeneous
> environment is useful because it increases the overall quality by
> pointing to buggy user code, toolchain bugs, etc.
>

Perhaps you should put more effort into making System/z actually *be*
like regular computers, if you really want that. Maybe wire in some
OpenQA testing to see how well Fedora works on System/z, since
mainframes normally don't have accelerators or ordinary peripheral
ports on the physical hardware and make up for it with obscene amounts
of computing power.

> >   * Where do we actually gain users of Fedora/s390x? When I search
> > for Linux s390x, I get to our Wiki (after several Ubuntu and IBM
> > links) [2], but that page is probably not very user targeted and
> > seems a bit outdated.
>
> The users are primarily interested in the enterprise products for
> s390x, but they are well aware that Fedora shows them what will appear
> in the next version of enterprise products. Or they simply need to
> run newer stuff than the enterprise versions provide.
>
> And yes, our wiki would appreciate more love :-)
>

Be the change! Also, talk to CommOps about figuring out how to improve it!


-- 
真実はいつも一つ!/ Always, there's only one truth!