Re: Fedora Updates System

Michael Schwendt <bugs.michael@xxxxxxx> · Fri, 24 Nov 2006 11:09:58 +0100

On Sun, 12 Nov 2006 16:42:20 -0500, Luke Macken wrote:

> [0]: http://cvs.fedoraproject.org/viewcvs/fedora-updates-system/?root=fedora
> [1]: http://fedoraproject.org/wiki/Infrastructure/UpdatesSystem

I'd like to take the opportunity for a few comments in no particular
order:

repoview
--------

Running it for all of Fedora Extras takes a lot of time, so long that
even if you let the push script do its work in a background terminal, it
is still painful to see how long it takes to complete.

For unknown reasons we also create repoview pages for the "debug"
repositories. If it were just my own decision, I would stop doing that,
since I doubt those web pages are popular enough. Who really browses
repoview for debuginfo packages? We should expect debuginfo packages to be
available for every relevant package in the repository.

createrepo < 0.4.5 is unable to handle "unknown" files in its repodata
directory. Therefore it conflicts with repoview: it hits a fatal error and
terminates prematurely, leaving behind a temporary ".olddir". This is
especially ugly, since an administrator needs to recover from that manually
and either move the files back from ".olddir" or delete it. But when it is
deleted, the repoview tree is lost and is created from scratch. I've been
told that this is a problem for mirrors, where several thousand files are
re-examined for changes just because of the fresh time stamps. So, as of a
few weeks ago, the push script works around that successfully with a
repodata backup strategy outside of createrepo.
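
To illustrate the idea of such a backup strategy, here is a minimal
sketch. It is not the push script's actual code: the function name
run_with_repodata_backup and the "regenerate" callable (a stand-in for
whatever invokes createrepo) are assumptions made for this example, and
cleaning up a leftover ".olddir" is left out.

```python
import os
import shutil
import tempfile


def run_with_repodata_backup(repo_dir, regenerate):
    """Run regenerate(repo_dir) with a safety copy of repodata/.

    Hypothetical helper: if regenerate raises, the previous repodata/
    tree (including repoview pages and their time stamps) is put back.
    """
    repo_dir = os.path.abspath(repo_dir)
    repodata = os.path.join(repo_dir, "repodata")
    # Keep the backup next to the repository so that restoring it is a
    # rename on the same filesystem rather than a second copy.
    backup_root = tempfile.mkdtemp(prefix="repodata-backup-",
                                   dir=os.path.dirname(repo_dir))
    backup = os.path.join(backup_root, "repodata")
    have_backup = os.path.isdir(repodata)
    try:
        if have_backup:
            # copytree copies with copy2 by default, which keeps mtimes.
            shutil.copytree(repodata, backup, symlinks=True)
        regenerate(repo_dir)
    except Exception:
        if have_backup and os.path.isdir(backup):
            # Drop whatever half-written state is there, restore the copy.
            shutil.rmtree(repodata, ignore_errors=True)
            shutil.move(backup, repodata)
        raise
    finally:
        shutil.rmtree(backup_root, ignore_errors=True)
```

Because the restored files keep their old time stamps, mirrors would
not see them as changed after a failed run.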

For createrepo and repoview to be run in the background as a scheduled
job, they need a local lock on the repository. Most likely _not_
fine-grained locking on every arch-specific sub-repository, because every
lock comes at a cost (especially when there are multiple jobs of
different priority waiting).
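
A single repository-wide lock of that kind could be sketched as follows.
This is only an illustration under assumptions: it uses an advisory
flock() lock (Unix only), and the lock file name ".push.lock" and the
helper name repository_lock are made up for this example.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def repository_lock(repo_root, blocking=True):
    """Hold one exclusive advisory lock for the whole repository."""
    lock_path = os.path.join(repo_root, ".push.lock")  # hypothetical name
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o664)
    try:
        flags = fcntl.LOCK_EX
        if not blocking:
            # Fail with BlockingIOError instead of waiting for the holder.
            flags |= fcntl.LOCK_NB
        fcntl.flock(fd, flags)
        yield
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

A scheduled job would wrap its createrepo and repoview runs in
"with repository_lock(root):", while a lower-priority job could pass
blocking=False and simply try again later.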

repoclosure
-----------

Running this takes even more time. Currently, it examines all of Fedora
Extras + Core + Updates + Legacy in a background job after packages have
been published. The time it takes is approximately the difference between
the time stamps of the build report and the broken deps report.

[...]

> ''' Pushing '''
>  * Moves packages to proper updates stage

More "stages" which are understood by plague would be good. We only use
one stage, needsign, which is the build-results repository known to the
build servers.

Fedora Extras had started with a small collection of sh/py scripts for
signing and moving rpms from plague's build-results directory into the
repository. Among the recurring problems, which led to some of the
development on the push script(s):

 - pulling away built rpms from under plague's feet

Initially (long ago) we signed rpms directly in the needsign repository
and moved the packages into the local master repository. Due to that, they
became unavailable to the build servers until they appeared on the public
master repository. Particularly troublesome, since we push to RDU, and
they are synced from there to RH, which is not an immediate operation.

 - permission problems

Even with a shared gid and umask, there are remaining problems, such as
explicit directory mode 0755 in yum backend code or Python modules.
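
One way to cope with directories that were created with an explicit
mode is to repair the modes afterwards. The following is a rough sketch
of that idea only; the helper name ensure_group_writable is made up, and
it assumes the files already belong to the shared group.

```python
import os
import stat


def ensure_group_writable(root):
    """Add group read/write (and search on directories) below root.

    Hypothetical repair step for trees where some code created
    directories with an explicit 0755 regardless of the umask.
    Returns the list of paths whose mode was changed.
    """
    fixed = []
    for dirpath, dirnames, filenames in os.walk(root):
        paths = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in paths:
            st = os.lstat(path)
            if stat.S_ISLNK(st.st_mode):
                continue  # never chmod through a symlink
            wanted = st.st_mode | stat.S_IRGRP | stat.S_IWGRP
            if stat.S_ISDIR(st.st_mode):
                wanted |= stat.S_IXGRP
            if wanted != st.st_mode:
                os.chmod(path, stat.S_IMODE(wanted))
                fixed.append(path)
    return fixed
```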

 - disk space constraints

No longer an issue since the larger hard disks were installed. But it
required working with temporary directories for signing and publishing
packages in order to avoid breakage in the middle of a push. It's not
trivial to recover from that without the help of a database or
transaction state information.
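
The temporary-directory approach can be sketched like this. It is a
simplified illustration, not the real push script: publish_via_staging
and the "sign" callable are hypothetical names, and the staging directory
is assumed to live on the same filesystem as the destination.

```python
import os
import shutil
import tempfile


def publish_via_staging(packages, dest_dir, sign):
    """Copy packages to a staging dir, sign them there, then publish.

    The originals are copied, not moved, so they stay available in the
    needsign repository. Nothing reaches dest_dir unless every package
    was copied and signed successfully.
    """
    staging = tempfile.mkdtemp(prefix=".push-", dir=dest_dir)
    try:
        staged = []
        for src in packages:
            tmp = os.path.join(staging, os.path.basename(src))
            shutil.copy2(src, tmp)
            sign(tmp)  # may raise; nothing has been published yet
            staged.append(tmp)
        published = []
        for tmp in staged:
            final = os.path.join(dest_dir, os.path.basename(tmp))
            # Atomic per file on the same filesystem, but not across
            # the whole set of files.
            os.replace(tmp, final)
            published.append(final)
        return published
    finally:
        shutil.rmtree(staging, ignore_errors=True)
```

Running out of disk space while copying or signing then leaves the
repository untouched. A failure inside the final loop would still
leave a partial push behind, which is where transaction state
information would be needed.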

>  * updates repo cleaner
>  * remove old packages

So far, the script I named repoprune is much faster than repomanage and
simplifies Fedora Extras repository maintenance a lot, since it gets rid
of orphaned and out-of-date sub-packages automatically, too.
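
To show why pruning by source package also catches orphaned
sub-packages, here is a small sketch. It is not repoprune's code and
makes several assumptions: the Binary record, the prune_candidates name
and the sortable "build" field (standing in for a real rpm version
comparison) are all invented for this example.

```python
from collections import defaultdict, namedtuple

# Hypothetical record: one binary rpm plus the source package and
# build it came from. "build" only needs to sort oldest to newest.
Binary = namedtuple("Binary", "filename name source_name build")


def prune_candidates(binaries, keep=1):
    """Return file names not built from the newest `keep` source builds."""
    builds = defaultdict(set)
    for b in binaries:
        builds[b.source_name].add(b.build)
    newest = {src: set(sorted(seen)[-keep:]) for src, seen in builds.items()}
    return sorted(b.filename for b in binaries
                  if b.build not in newest[b.source_name])
```

If build 1 of "foo" produced foo, foo-devel and foo-static, and build 2
only produces foo and foo-devel, then foo-static from build 1 is still
the newest package of that name. Pruning per package name would keep
it, while grouping by source package marks it for removal together with
the other build 1 files.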