On Sun, 12 Nov 2006 16:42:20 -0500, Luke Macken wrote:

> [0]: http://cvs.fedoraproject.org/viewcvs/fedora-updates-system/?root=fedora
> [1]: http://fedoraproject.org/wiki/Infrastructure/UpdatesSystem

I'd like to take the opportunity for a few comments in no particular
order:

repoview
--------

Running it for all of Fedora Extras takes a lot of time. So long that
even if you let the push script do its work in a background terminal,
it is still painful to see how long it takes to complete.

For unknown reasons we also create repoview pages for the "debug"
repositories. If it were just my own decision, I would stop doing that,
since I doubt those web pages are popular enough. Who really browses
repoview for debuginfo packages? We should expect debuginfo packages to
be available for every relevant package in the repository.

createrepo < 0.4.5 is unable to handle "unknown" files in its repodata
directory. It therefore conflicts with repoview and terminates
prematurely with a fatal error, leaving behind a temporary ".olddir".
This is especially ugly, since an administrator needs to recover from
that manually and either move the files back from ".olddir" or delete
it. But when it is deleted, the repoview tree is lost and is recreated
from scratch. I've been told that this is a problem for mirrors, where
several thousand files are re-examined for changes just because of
the fresh time stamps. So, as of a few weeks ago, the push script works
around that successfully with a repodata backup strategy outside of
createrepo.

For createrepo and repoview to be run in the background as a scheduled
job, they need a local lock on the repository. Most likely _not_
fine-grained locking on every arch-specific sub-repository, because
every lock comes at a cost (especially when there are multiple jobs of
different priority waiting).

repoclosure
-----------

Running this takes even more time.
Currently, it examines all of Fedora Extras + Core + Updates + Legacy in
a background job after packages have been published. The time it takes
is approximately the difference between the time stamps of the build
report and the broken deps report.

[...]

> ''' Pushing '''
>  * Moves packages to proper updates stage

More "stages" which are understood by plague would be good. We only use
one stage, needsign, which is the build-results repository known to the
build servers.

Fedora Extras started with a small collection of sh/py scripts for
signing and moving rpms from plague's build-results directory into the
repository. Among the recurring problems, which led to some of the
development on the push script(s):

 - pulling away built rpms from under plague's feet

   Initially (long ago) we signed rpms directly in the needsign
   repository and moved the packages into the local master repository.
   Due to that, they became unavailable to the build servers until they
   appeared on the public master repository. Particularly troublesome,
   since we push to RDU, and they are synced from there to RH, which is
   not an immediate operation.

 - permission problems

   Even with a shared gid and umask, there are remaining problems, such
   as an explicit directory mode 0755 in yum backend code or Python
   modules.

 - disk space constraints

   No longer an issue since the larger hdds were installed. But it
   required working with temporary directories for signing and
   publishing packages in order to avoid breakage in the middle of a
   push. It's not trivial to recover from that without the help of a
   database or transaction state information.

>  * updates repo cleaner
>  * remove old packages

So far, the script I named repoprune is much faster than repomanage and
simplifies Fedora Extras repository maintenance a lot, since it gets
rid of orphaned and out-of-date sub-packages automatically, too.
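
Coming back to the repository lock for createrepo and repoview: below is
a minimal sketch of what I mean by one coarse lock per repository rather
than one per arch-specific sub-repository. It is not taken from the push
script. It assumes Python's fcntl.flock on a lock file at the top of the
repository tree. The function name repo_lock and the file name
".push.lock" are made up for illustration.

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def repo_lock(repo_root, blocking=True):
    """Hold one exclusive lock for the whole repository tree.

    Hypothetical helper: the lock file name is an assumption, not
    something the push script is known to use.
    """
    lock_path = os.path.join(repo_root, ".push.lock")
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o664)
    try:
        flags = fcntl.LOCK_EX
        if not blocking:
            # Fail immediately with OSError instead of waiting.
            flags |= fcntl.LOCK_NB
        fcntl.flock(fd, flags)
        yield lock_path
    finally:
        # Closing the descriptor also releases the flock.
        os.close(fd)


# Usage sketch: a scheduled job takes the lock around both tools, so a
# concurrent push on the same machine waits until the metadata is done.
#
#   with repo_lock("/path/to/repo"):
#       run createrepo, then repoview
```

A scheduled low-priority job could pass blocking=False and simply try
again later if a push holds the lock. Note that flock is only reliable
as a local lock, which matches the "local lock" above; it says nothing
about hosts that see the same tree over a network file system.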