Ed Swierk wrote:
Having tired of babysitting the rsync cron job that was keeping my
local Fedora mirror up-to-date, I tried the caching proxy approach
suggested at http://fedoraproject.org/wiki/Infrastructure/Mirroring/SiteLocalMirrors
for a few weeks. This, too, was unsatisfactory--I still want some
control over the mirrored content and the ability to pre-populate the
cache from a DVD ISO acquired via bittorrent when a new version of
Fedora is released.
ApacheMirror.py is a mod_python request handler that behaves like a
caching proxy, except it maps the URL path of a cached document
directly to a local directory rather than hashing the URL, thus
preserving the mirror directory structure.
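To make the path-mapping idea concrete, here is a minimal sketch in plain Python 3. It is not taken from ApacheMirror.py and does not use the mod_python API; the function name local_path_for and its arguments are hypothetical. It only shows how a request path can be turned into a file location under the document root without hashing.

```python
import os
import posixpath
from urllib.parse import unquote, urlsplit


def local_path_for(url_path, document_root):
    """Map a request path to a file under document_root, keeping the URL layout.

    Hypothetical helper for illustration; not the ApacheMirror.py code.
    """
    # Drop any query string, decode %xx escapes, and collapse "." and ".."
    # segments. Normalising an absolute path can never climb above "/".
    decoded = unquote(urlsplit(url_path).path)
    clean = posixpath.normpath("/" + decoded.lstrip("/"))
    parts = [segment for segment in clean.split("/") if segment]
    # Every URL segment becomes a directory or file name below the root.
    return os.path.join(document_root, *parts)
```

With DocumentRoot set to /mirrors, a request for /pub/fedora/linux/foo.rpm would map to /mirrors/pub/fedora/linux/foo.rpm, so the cache can be browsed, rsynced, or pre-populated like an ordinary mirror tree.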
Just drop ApacheMirror.py into /usr/lib/python*/site-packages, set
your preferred upstream server and point it at a local directory on a
nice big disk, and forget it:
<VirtualHost *:80>
ServerName mirrors.sample.com
ServerAlias mirrors
DocumentRoot /mirrors
SetHandler mod_python
PythonHandler ApacheMirror
PythonDebug on
PythonOption ApacheMirror.upstream http://download.fedora.redhat.com
</VirtualHost>
The implementation is by no means bulletproof--consider this release
0.1--but it's worked well enough to serve local yum needs for the past
few days.
If there's interest, I could package up the script into an srpm (which
seems overkill for 50 lines of Python) or submit it as a patch to some
existing package.
--Ed
Excellent, I was hoping for something like this! I had played a bit
with both squid and varnish, but neither was fully satisfactory because
they can't easily store your cache in the original directory structure
without writing your own backend storage engine.
Could you please create an "upstream" project for it at
hosted.fedoraproject.org? I think there are a number of improvements
that can be made.
http://fedoraproject.org/wiki/Infrastructure/ProjectHosting/RequestingNewProject
I haven't read deeply into your code yet, but I imagine that it needs
improvement to handle the unique synchronization and expiration issues
that yum repos and rawhide install trees create when file contents
change without the filenames changing.
Perhaps a separate, asynchronous daemon can monitor upstream (via HTTP
or whatever) for repomd.xml changes. It should then parse the
repomd.xml so it knows when to expire the repodata/* files. Then it
should parse the .xml files in repodata/ to compare them to local storage,
and intelligently expire the packages if any changed (as happens during
signing). It can then know exactly which files to delete from the local
cache because they are no longer in the upstream. This daemon interacts
with ApacheMirror.py only in deleting files from the local directories,
effectively expiring the cache. Very simple.
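The first step of that daemon, working out which repodata/* files to expire when repomd.xml changes, could be sketched roughly as below. This assumes the usual repomd.xml layout (data elements carrying a location href and a checksum, in the http://linux.duke.edu/metadata/repo namespace). The function names are hypothetical, and fetching the file and deleting from the cache are left out.

```python
import xml.etree.ElementTree as ET

# Namespace used by repomd.xml elements.
REPO_NS = "{http://linux.duke.edu/metadata/repo}"


def repodata_entries(repomd_xml):
    """Return {href: checksum} for every <data> entry in a repomd.xml document."""
    root = ET.fromstring(repomd_xml)
    entries = {}
    for data in root.findall(REPO_NS + "data"):
        location = data.find(REPO_NS + "location")
        checksum = data.find(REPO_NS + "checksum")
        if location is None or checksum is None:
            continue  # skip entries we cannot compare
        entries[location.get("href")] = (checksum.text or "").strip()
    return entries


def stale_files(cached_repomd, upstream_repomd):
    """List cached metadata files that upstream has dropped or replaced."""
    cached = repodata_entries(cached_repomd)
    upstream = repodata_entries(upstream_repomd)
    return sorted(
        href for href, digest in cached.items() if upstream.get(href) != digest
    )
```

The daemon would then delete each path returned by stale_files() from the local directory, and ApacheMirror.py would fetch the fresh copy on the next request. The same compare-and-delete approach could be extended to the package lists inside the primary metadata.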
That daemon could be configured to handle intelligent expiry of various
parts of the mirror tree in different ways. For example:
- development (rawhide) repo changes at least once per day. It also
contains install images (boot.iso, bootdisk.img, stage2, etc.) that need
to be expired every time the tree changes. (We might need to add a
hashes file to the mirror tree to allow the tool to monitor these changes.)
- Released distros never change, so there is no need to monitor their
repomd.xml for changes.
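One way to express the per-tree behaviour in the list above is a small policy table keyed by path prefix. Everything here is an assumption for illustration: the prefixes "development/" and "releases/", the ExpiryPolicy fields, and the poll intervals are placeholders, not values from any existing tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExpiryPolicy:
    poll_seconds: Optional[int]  # None means: never poll repomd.xml
    expire_images: bool          # also drop install images when the tree changes


# Hypothetical prefixes and intervals; a real config would come from the package.
POLICIES = {
    "development/": ExpiryPolicy(poll_seconds=3600, expire_images=True),
    "releases/": ExpiryPolicy(poll_seconds=None, expire_images=False),
}
DEFAULT_POLICY = ExpiryPolicy(poll_seconds=86400, expire_images=False)


def policy_for(relative_path):
    """Pick the policy whose prefix matches the path; longest prefix wins."""
    matches = [prefix for prefix in POLICIES if relative_path.startswith(prefix)]
    if not matches:
        return DEFAULT_POLICY
    return POLICIES[max(matches, key=len)]
```

A released tree gets poll_seconds=None, so the daemon never looks at its repomd.xml, while anything under the rawhide prefix is polled and has its install images expired along with the metadata.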
Please create an upstream project at hosted.fedoraproject.org and let's
get started on this! You get to choose a project name for your new
"upstream" project. I personally would choose something really obvious
like InstantMirror... but it's your call.
The default definitions for mirroring download.fedoraproject.org could
be included in a Fedora/EPEL package that requires ApacheMirror.py and
the monitor/expiry daemon. That way a sysadmin who wants to create an
instant Fedora mirror need only install that package and enable it in
/etc/httpd/conf.d/. A yum update then pulls in any updates for tree
changes (repo locations, how often to poll for repomd.xml changes, etc.).
Example:
yum install InstantMirror-fedora
vim /etc/httpd/conf.d/InstantMirror-fedora.conf
#(enable stuff)
service httpd restart
# http://fedora.localdomain.com
Instant Fedora mirror!
InstantMirror-fedora.noarch.rpm : instant Fedora mirror
InstantMirror-centos.noarch.rpm : instant CentOS mirror
InstantMirror-rpmfusion.noarch.rpm : instant RPMFusion mirror
InstantMirror-foo.noarch.rpm : instant Foo mirror
Warren Togami
wtogami@xxxxxxxxxx
p.s.
The same code could be used to create a public static-repos mirror.
static-repos changes many times per hour, so it would probe for changes
every 2 minutes. We need a few permanent public mirrors of this so people stop
hitting the koji server directly. Any public mirror interested in
hosting this?
p.p.s.
Another idea before I forget about it:
Later add configurable fallbacks to a different upstream if
download.fp.org is down. mirrors.kernel.org might be a good default
alternative, for example.
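A rough sketch of how such a fallback could work, using only the Python 3 standard library. The function name fetch_with_fallback and its parameters are hypothetical. It also assumes each upstream base URL already includes whatever path prefix that mirror uses, since mirrors do not all share the same layout.

```python
import urllib.request


def fetch_with_fallback(path, upstreams, opener=urllib.request.urlopen, timeout=10):
    """Try each upstream base URL in order; return (url, body) from the first success.

    `opener` is injectable so the logic can be exercised without a network.
    """
    last_error = None
    for base in upstreams:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            with opener(url, timeout=timeout) as response:
                return url, response.read()
        except OSError as exc:
            # urllib's URLError/HTTPError and socket timeouts are all OSError
            # subclasses, so an HTTP error also moves on to the next upstream.
            last_error = exc
    raise RuntimeError("no upstream could serve %s" % path) from last_error
```

Note that this reads the whole response into memory, which is fine for metadata but not for ISO images. A real handler would stream to disk instead.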
--
fedora-devel-list mailing list
fedora-devel-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-devel-list