Re: A proposal for Fedora updates

Kevin Fenzi <kevin@xxxxxxxxx> · Fri, 27 Mar 2015 06:49:27 -0600

On Thu, 26 Mar 2015 23:55:31 +0000 (UTC)
Bojan Smojver <bojan@xxxxxxxxxxxxx> wrote:

> M. Edward (Ed) Borasky <znmeb <at> znmeb.net> writes:
> 
> > As a bleeding-edge user I'd be in favor of this, although I thought
> > that was what 'updates-testing' was.
> 
> Maybe I'm misunderstanding how things work, but I think every package
> in updates-testing is signed by a human, on an "offline" machine
> (i.e. someone has to walk the RPM to it using physical media, sign it
> and then bring it back and upload it), which may be causing some of
> these delays. So, I was thinking of a more relaxed signing key, which
> would be used directly by the build system after people build the
> packages. Virus and malware scanning at this point would be useful,
> of course, but would not catch everything - that's for sure.
> 
> PS. Apologies if the above is misinformation. Going from memory here,
> from the days of that Fedora compromise a few years ago.

This is indeed not really correct. ;) 

Let me go over the current process and what takes time: 

* releng person gathers list of pending update requests from bodhi.
  (a few minutes)

* releng person looks over list for anything out of the ordinary or
  off. (another few minutes)

* releng person tells sigul to sign that list of packages and write out
  the signed ones in koji. The releng person talks to the sigul bridge
  and the sigul vault (which is not reachable via ssh) talks to the
  bridge.

  The time here varies a lot. On a "normal" day it wouldn't take that
  long. If there are updates to some packages with tons of subpackages
  (like texlive that has 5000 or so, root with 1000s, or a kde or gnome
  'mega' update with 1000 or so) it will take longer. One big
  bottleneck here is that sigul has a 'batch mode' that allows you to
  tell it to sign a bunch of rpms at once, but it locks up often, so we
  can't really use it. That means we need to run it in one-at-a-time
  mode, where it makes a connection, does a bunch of handshaking, signs
  1 rpm and tears it all down. I recently looked and this one-at-a-time
  mode gets us about 20 rpms/minute or so (of course this also varies:
  when you have an rpm that's gigantic, like the texlive src.rpm,
  webkitgtk4-debuginfo or 0ad, it takes a while to transfer the rpm
  around). Batch mode would get us around 100/minute or more. We have
  some Google Summer of Code proposals to work on sigul and hopefully
  solve this lockup issue. So, this step could take an hour or several
  hours.
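
  As a rough illustration of the signing arithmetic above, here is a
  minimal sketch. It uses only the approximate rates and the texlive
  subpackage count quoted in this mail; the function name is made up
  for the example, and it ignores transfer time for very large rpms.

```python
# Approximate rates quoted above (rpms signed per minute).
ONE_AT_A_TIME_RPMS_PER_MIN = 20
BATCH_RPMS_PER_MIN = 100


def signing_minutes(rpm_count, rpms_per_minute):
    """Estimated minutes to sign rpm_count rpms at a steady rate."""
    return rpm_count / rpms_per_minute


# A texlive-sized update: about 5000 subpackages (figure from above).
texlive_rpms = 5000

slow = signing_minutes(texlive_rpms, ONE_AT_A_TIME_RPMS_PER_MIN)
fast = signing_minutes(texlive_rpms, BATCH_RPMS_PER_MIN)

print(f"one at a time: {slow:.0f} min ({slow / 60:.1f} h)")
print(f"batch mode:    {fast:.0f} min ({fast / 60:.1f} h)")
```

  Under those assumptions a single texlive update alone accounts for
  several hours in one-at-a-time mode, which is in line with the "hour
  or several hours" range.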

* Once all packages are signed, releng person again asks bodhi for a
  list of pending updates, and of course since our updates flow is set
  to 11, there's a bunch more now. So, repeat the first 3 steps above
  until you manage to get everything signed and ready. Usually this
  repeat gets smaller and smaller as you catch up to new submissions.
  So, it usually takes just 15min or so (unless there's texlive or
  the like).
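
  To show why the repeat "gets smaller and smaller", here is a small
  sketch of the catch-up loop. The arrival rate of new submissions and
  the initial backlog are NOT from this mail; they are hypothetical
  numbers picked only to make the shape visible. The signing rate is
  the rough 20 rpms/minute figure from the previous step.

```python
def catch_up_rounds(initial_rpms, sign_rate, arrival_rate, stop_below=1.0):
    """Return the number of rpms handled in each signing round.

    sign_rate and arrival_rate are in rpms per minute. Each round signs
    everything pending; whatever was submitted meanwhile forms the
    next round. Stops once less than stop_below rpms are pending.
    """
    if arrival_rate >= sign_rate:
        raise ValueError("arrivals outpace signing; the loop never catches up")
    rounds = []
    pending = float(initial_rpms)
    while pending >= stop_below:
        rounds.append(pending)
        minutes = pending / sign_rate      # time spent signing this round
        pending = arrival_rate * minutes   # new rpms submitted meanwhile
    return rounds


# HYPOTHETICAL inputs: 1200 rpms pending, 2 new rpms arriving per minute.
rounds = catch_up_rounds(initial_rpms=1200, sign_rate=20, arrival_rate=2)
for number, size in enumerate(rounds, start=1):
    print(f"round {number}: about {size:.1f} rpms")
```

  Each round shrinks by the ratio of arrival rate to signing rate, so
  the loop converges quickly as long as signing is the faster of the
  two.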

* The push is started in bodhi. When we are not in freeze we typically
  just do all the branches at once. This allows for less babysitting
  time needed and ensures a common blob of packages goes out (i.e., if
  we only did f22 then f21, there could be a newer update in f21
  submitted after we did f22, causing update path issues, etc).
  Currently that's f22, f21, f20. (a few minutes)

* bodhi fires off (one at a time) mashes for each of the repos. This is
  the step that takes by far the most time and wouldn't really be
  helped by having another repo. mash in turn: gathers all the
  packages from koji, checks signatures, makes drpms, does multilib for
  x86_64, etc. 

  Part of the problem here is people don't realize how vast our pile of
  updates is. Just in terms of disk: 

141G    /mnt/koji/mash/updates/f20-updates-150326.1141/f20-updates
35G     /mnt/koji/mash/updates/f20-updates-testing-150325.1649/f20-updates-testing
87G     /mnt/koji/mash/updates/f21-updates-150324.2207/f21-updates
33G     /mnt/koji/mash/updates/f21-updates-testing-150325.1806/f21-updates-testing
60G     /mnt/koji/mash/updates/f22-updates-testing-150325.1444/f22-updates-testing

  In terms of files: 

/mnt/koji/mash/updates/f20-updates-150326.1141/f20-updates: 159131
/mnt/koji/mash/updates/f20-updates-testing-150325.1649/f20-updates-testing: 58444
/mnt/koji/mash/updates/f21-updates-150324.2207/f21-updates: 65604
/mnt/koji/mash/updates/f21-updates-testing-150325.1806/f21-updates-testing: 15223
/mnt/koji/mash/updates/f22-updates-testing-150325.1444/f22-updates-testing: 33378

  The things that would help us here would be: 

  bodhi2 finally landing, as it has support for doing multithreaded
  mashing and could thus fire off mashes for all the repos at once (or
  some subset of them as needed), which would give us parallel instead
  of serial. 

  We landed support in rawhide/f22 composes for doing multithreading on
  drpm creation. I'm not sure that's enabled for f20/f21 or how hard it
  would be to do so in bodhi1, but that might be a win if we can do it. 

  Fewer updates or no drpms or no multilib would also save us a lot of
  time here. 

  (This step (mashing repos) takes many many hours, and is by far the
  bulk of the process). 
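
  To put the listings above in one place, and to give a feel for what
  parallel mashing could buy, here is a minimal sketch. The sizes and
  file counts are copied from the listings. The comparison at the end
  rests on a loud ASSUMPTION that is not from this mail: that mash time
  is roughly proportional to repo size and that parallel mashes do not
  slow each other down. Real mashes share disk and CPU, so treat the
  ratio as a best case.

```python
# (size in G as printed by du, file count) per mashed repo, from above.
REPOS = {
    "f20-updates": (141, 159131),
    "f20-updates-testing": (35, 58444),
    "f21-updates": (87, 65604),
    "f21-updates-testing": (33, 15223),
    "f22-updates-testing": (60, 33378),
}

total_gb = sum(size for size, _ in REPOS.values())
total_files = sum(files for _, files in REPOS.values())

# ASSUMPTION: cost of a mash is proportional to repo size.
# Serial mashing pays for every repo in turn; an ideal parallel run
# only waits for the largest one.
serial_cost = total_gb
parallel_cost = max(size for size, _ in REPOS.values())

print(f"total size:  {total_gb}G")
print(f"total files: {total_files}")
print(f"ideal parallel/serial ratio: {parallel_cost / serial_cost:.2f}")
```

  Even in this best case the largest repo (f20-updates here) sets the
  floor, which is why fewer updates, no drpms or no multilib would
  still matter after parallel mashing lands.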

* Once done mashing all the repos, bodhi does a bunch of checks,
  inserts information about security updates and such into the
  repodata, and then the updates are synced to the master mirror. (Takes
  30min or so). 

  What would help us here: bodhi2 has fedmsg support and we can then
  sync to master mirrors on a fedmsg instead of a cron. This would
  shave off 15min or so here I suspect. 
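
  The saving from triggering on a message instead of a cron job comes
  from not waiting for the next scheduled run. A minimal sketch of that
  arithmetic follows. The 30-minute cron interval is HYPOTHETICAL (the
  actual schedule is not stated here), chosen because its average wait
  happens to match the "15min or so" estimate; the function names are
  made up for the example.

```python
def average_cron_wait(interval_min):
    """Mean wait for the next cron run, assuming the mash finishes at a
    uniformly random moment within the interval."""
    return interval_min / 2


def wait_until_next_run(finished_at_min, interval_min):
    """Minutes from finishing until the next cron run.

    Runs are assumed to fire at minute 0, interval_min, 2*interval_min...
    """
    remainder = finished_at_min % interval_min
    return 0 if remainder == 0 else interval_min - remainder


CRON_INTERVAL_MIN = 30  # HYPOTHETICAL interval, not from this mail

print(f"average wait with cron: {average_cron_wait(CRON_INTERVAL_MIN)} min")
print(f"mash done at minute 47: wait {wait_until_next_run(47, CRON_INTERVAL_MIN)} min")
print("with a message trigger the sync starts when the message arrives")
```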

* Once updates are on the master mirror, bodhi updates bugzilla tickets
  and marks the requests in its interface as done. (just a few
  minutes). I suppose if bugzilla was faster here it could help us.

* Mirrormanager then notices that the repos have changed and generates
  new repo information and pushes that out to the mirrorlist servers.
  This could take an hour or so. 

  Things that would help here: Get mirrormanager2 in production so we
  can start adjusting behavior, add fedmsg triggers on repo pushes,
  etc. 

So, IMHO, another repo wouldn't help us here. Perhaps it would save
time on the signing, but it wouldn't on the mashing step, and it would
add to confusion and things we need to make and care about. I'd much
rather try and land all the improvements above and make things faster. 

I'll note that we have discussed a lot having a 'urgent updates' repo
that might bypass this process, but we wanted that to be restricted to
urgent security updates only and there's still discussion about how we
want to do it and make it any faster (possibly no drpms, no
mirrormanager (always points to master), no bodhi (just made
manually), etc etc).

kevin