On 11/20/07, Warren Togami <wtogami@xxxxxxxxxx> wrote:
> BUGS TO FIX TO MAKE IT ACTUALLY WORK
> - wget --server-response http://PATH/TO/LARGE/FILE
>   It constantly redirects over and over again rather than waiting.
>   FILE.tmp never gets larger than 8192 or 16384 bytes.

Hmm, my setup doesn't do that. What upstream server are you using, and
which release of Fedora? I am running it on an FC6 machine, using
http://linux.nssl.noaa.gov .

> - We MUST find a way so the client can begin downloading the file while
>   the mirror downloads the file. Any other reverse proxy server would
>   allow this. This might be difficult to implement (perhaps impossible
>   with mod_python?) but this might be the only sane way to fix the
>   previous problem.

I don't think this has anything to do with the problem above. This
would be a nice enhancement, though. It does complicate the
implementation, especially to handle byte range requests properly.

> - Even if apache:apache owns DocumentRoot, SELinux denies by default.
>   Need sane solution.

I haven't yet tested on a machine with SELinux enabled.

I've attached an updated version of ApacheMirror.py with a redirection
bug fixed; previously it omitted the http://server part of the URL,
confusing some clients like Anaconda. I don't expect this to fix the
problem you're seeing, though.

--Ed
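On the "client downloads while the mirror downloads" idea: one way to picture it is a tee, where each chunk read from upstream is written to the temporary file and handed to the client in the same pass. What follows is only a rough sketch in plain Python 3, not part of the attached ApacheMirror.py and not tested under mod_python. The name tee_to_cache, its parameters, and the 8192-byte chunk size are hypothetical, and byte range requests are ignored entirely.

```python
import io
import os
import tempfile


def tee_to_cache(source, cache_path, chunk_size=8192):
    """Yield chunks from `source` while also saving them to `cache_path`.

    `source` is any binary file-like object (for example an upstream HTTP
    response). Data goes to cache_path + ".tmp" first and is renamed into
    place only after the whole stream has been read.
    """
    tmp_path = cache_path + ".tmp"
    complete = False
    try:
        with open(tmp_path, "wb") as tmp:
            while True:
                chunk = source.read(chunk_size)
                if not chunk:
                    break
                tmp.write(chunk)   # keep a copy for the mirror
                yield chunk        # hand the same bytes to the client
        os.replace(tmp_path, cache_path)
        complete = True
    finally:
        # If the client went away mid-transfer, drop the partial file
        # so it is never mistaken for a finished download.
        if not complete and os.path.exists(tmp_path):
            os.unlink(tmp_path)


# Usage: an in-memory stream stands in for the upstream server here.
demo_dir = tempfile.mkdtemp()
demo_target = os.path.join(demo_dir, "FILE")
for piece in tee_to_cache(io.BytesIO(b"x" * 20000), demo_target):
    pass  # a real handler would write each piece to the client
```

A handler built this way would write each yielded chunk to the client instead of sending dots, so the client sees real data as soon as the first chunk arrives.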
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Library General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
# Copyright (c) 2007 Arastra, Inc.

import mod_python, mod_python.util, urllib, os, shutil, time, calendar

"""ApacheMirror implements an automatically-populated mirror of static
documents from an upstream server.  It was originally developed for
mirroring a Fedora Linux tree and should work for any simple directory
tree of static files.

When a document request arrives, ApacheMirror checks the last-modified
time of the document at the upstream server.  If the upstream copy is
newer than the local copy, or a local copy does not exist, it
downloads the document and stores it locally before serving it to the
client.  If the upstream copy cannot be found, either because it does
not exist or because the server is unreachable, the request is served
directly from the local mirror.  Directory indexes are always
requested from the upstream server, since they tend to change
frequently.

Superficially ApacheMirror behaves like mod_disk_cache combined with
ProxyPass, except it maps the URL path directly to a local directory
rather than hashing the URL.  This allows the administrator to
pre-populate portions of the mirrored tree quickly, for example from a
DVD ISO acquired via BitTorrent.

ApacheMirror makes certain assumptions about how the upstream server
provides directory indexes, and does not deal with query strings (the
part of the URL after the ?) at all.

To configure a Fedora Linux mirror for a virtual host
mirrors.sample.com, add something like this to httpd.conf:

  <VirtualHost *:80>
    ServerName mirrors.sample.com
    ServerName mirrors
    DocumentRoot /mirrors
    SetHandler mod_python
    PythonHandler ApacheMirror
    PythonDebug on
    PythonOption ApacheMirror.upstream http://download.fedora.redhat.com
  </VirtualHost>

Ensure mod_python is installed and enabled, and that /mirrors is
writable by the apache user.
"""

def handler(req):
    if req.uri.endswith("/index.html"):
        return mod_python.apache.DECLINED
    try:
        upstream = req.get_options()["ApacheMirror.upstream"] + req.uri
        o = urllib.urlopen(upstream)
        mtime = calendar.timegm(o.headers.getdate("Last-Modified") or time.gmtime())
        isdir = o.url.endswith("/")
    except:
        return mod_python.apache.DECLINED
    local = req.document_root() + req.uri
    if isdir:
        if not req.uri.endswith("/"):
            mod_python.util.redirect(req, "http://" + req.server.server_hostname + req.uri + "/")
        local = os.path.join(local, "index.html")
    dir = os.path.dirname(local)
    if not os.path.exists(dir):
        os.makedirs(dir)
    if isdir:
        urllib.urlretrieve(upstream, local + ".tmp")
        if os.path.exists(local):
            os.unlink(local)
        os.rename(local + ".tmp", local)
        os.utime(local, (mtime,) * 2)
        req.internal_redirect(req.uri + "index.html")
    if os.path.exists(local) and os.stat(local).st_mtime >= mtime:
        return mod_python.apache.DECLINED
    req.err_headers_out["Location"] = "http://" + req.server.server_hostname + req.uri
    req.status = mod_python.apache.HTTP_MOVED_TEMPORARILY
    req.write("ApacheMirror retrieving %s" % upstream)
    # Keep feeding dots to the client so it doesn't time out while we
    # download a huge file from upstream
    urllib.urlretrieve(upstream, local + ".tmp", lambda *args: req.write("."))
    if os.path.exists(local):
        os.unlink(local)
    os.rename(local + ".tmp", local)
    os.utime(local, (mtime,) * 2)
    req.write("done!\n")
    return mod_python.apache.OK
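
The freshness test in handler() comes down to comparing the upstream Last-Modified time with the local file's mtime, treating a missing header as "modified just now". As a standalone illustration, here is roughly the same comparison using only the Python 3 standard library. The name needs_refresh and its arguments are hypothetical and do not appear in ApacheMirror.py; the `now` parameter exists only to make the fallback testable.

```python
import calendar
import os
import time
from email.utils import parsedate


def needs_refresh(local_path, last_modified_header, now=None):
    """Return True if the local copy is missing or older than upstream.

    `last_modified_header` is the raw Last-Modified header value, or None
    if upstream did not send one.
    """
    parsed = parsedate(last_modified_header) if last_modified_header else None
    if parsed is None:
        # No usable header: assume upstream changed just now, which
        # forces a download (the same fallback handler() uses).
        upstream_mtime = int(now if now is not None else time.time())
    else:
        # HTTP dates are always GMT, so interpret the tuple as UTC.
        upstream_mtime = calendar.timegm(parsed)
    if not os.path.exists(local_path):
        return True
    return os.stat(local_path).st_mtime < upstream_mtime
```

Because the handler stamps each downloaded file with the upstream mtime via os.utime, an unchanged upstream file compares equal on the next request and is served straight from the local tree.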
--
fedora-devel-list mailing list
fedora-devel-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-devel-list