On Fri, Mar 16, 2007 at 10:01:57PM +0100, Thomas M Steenholdt wrote:
> Michael E Brown wrote:
> >
> > Matt Domsch is working on just such a tool and is looking to have it in
> > place for F7 release, afaik. The tool is Mirror Manager.
> >
>
> This looks like a very competent tool indeed and there's no doubt that
> it will be very useful for a lot of cases. However, I have no idea how
> the mirror validation in the package will work; I just hope it will be
> implemented in a way that is usable without special tools. Having a
> way to validate a mirror from within the ftp directory listing is very
> valuable - especially to mirror scripts etc.
>
> It looks to me like MirrorManager will notify the main site that the
> sync has completed. This is useful, but probably not to other mirror
> sites (or we may need specialized tools to perform the check). I could
> easily be mistaken here, though!

First, the problems with the current mirroring are really twofold:

1) The storage array backing the master rsync servers is undergoing
some serious stress. This makes the master rsync servers that serve the
data very slow, so the global mirror servers pulling from them are
seeing very slow syncs. Red Hat I/T is working on it. (It isn't helping
that the RHEL5 floodgates opened on Wednesday either - that just added
stress to an already stressed set of people and colo networks.)

2) The global mirrors aren't being notified when content has changed on
the master, so they don't know when they should start a new rsync run.
MirrorManager takes a per-host email address, to which the master
sign-and-push scripts will eventually send a notification when the
content changes. As it stands, syncing every 6 hours when nothing has
changed doesn't make any sense.

Now, MirrorManager has two methods by which it can know a given mirror
host is up-to-date. First is a new report_mirrors script that uploads
directory data back to the database from the mirror itself. Not
everyone will want to run that, and there's always the 'trust but
verify' model, so I've also got a (fast?) crawler that crawls each host
using HTTP HEADs and keepalives, or FTP DIR calls, looking for content
it should be carrying compared to the master list, and tracking
presence and up-to-dateness on a per-directory level (a rough sketch of
the HEAD-based check appears at the end of this message). Hosts that
aren't up-to-date get dropped from the appropriate per-directory lists
(e.g. the repodata dirs) in real time.

That's the idea. A lot of the code is implemented; there's more to go.
If you're good with python, turbogears, and the like, I'm sure I could
put you to work on it. Drop me a note.

Thanks,
Matt

--
Matt Domsch
Software Architect
Dell Linux Solutions
linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com

--
fedora-devel-list mailing list
fedora-devel-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-devel-list
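
For the curious, here is a minimal sketch of what a HEAD-based
freshness check could look like. This is not MirrorManager's actual
crawler code; the host name, the master_files list, and the
check_mirror helper are all made-up examples of the general technique
(one keep-alive connection, HEAD each file, compare size and
modification time against the master list):

    # Sketch only: check a mirror's freshness with HTTP HEAD requests
    # over a single reused (keep-alive) connection.
    import http.client
    from email.utils import parsedate_to_datetime

    # Hypothetical master list: path -> (size in bytes, last-modified)
    master_files = {
        "/fedora/linux/releases/test/7/Fedora/repodata/repomd.xml":
            (2157, parsedate_to_datetime("Fri, 16 Mar 2007 12:00:00 GMT")),
    }

    def check_mirror(host, files):
        """Return the paths that look missing or stale on `host`."""
        stale = set()
        conn = http.client.HTTPConnection(host)  # one connection, reused
        try:
            for path, (size, mtime) in files.items():
                conn.request("HEAD", path)
                resp = conn.getresponse()
                resp.read()  # drain so the connection can be reused
                if resp.status != 200:
                    stale.add(path)
                    continue
                length = resp.getheader("Content-Length")
                modified = resp.getheader("Last-Modified")
                if length is None or int(length) != size:
                    stale.add(path)
                elif modified and parsedate_to_datetime(modified) < mtime:
                    stale.add(path)
        finally:
            conn.close()
        return stale

    print(check_mirror("mirror.example.org", master_files))

A real crawler would aggregate these results per directory rather than
per file, so that a directory whose contents don't match the master can
be dropped from that directory's mirror list until it catches up.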