Greetings.

With our recent woes about rsync hitting our storage kind of hard, I had an idea of how we could make things better for at least our tier1 mirrors and anyone who is actually paying attention.

The only time our master mirrors change is when we push updates (once a day) or when epel7beta/rawhide finish. When we have branched, that adds another set.

Right now, mirrors (or anyone else rsyncing content) just hit the master mirrors N times a day and pull any changes. Or, if they are very smart (like a few folks on the mirrors list), they grab the fullfilelist and only sync if it has changed. That still hits the full tree when they do sync.

I think there's something we can do that might improve things: rsync has an ability to make "BATCH MODE" updates (see 'man rsync' and the BATCH MODE section).

- Enhance the cron job that syncs updates to generate a batch on each push and place it on the master mirrors.
- Enhance the rawhide/epel7beta/branched crons to generate a batch file for rawhide/epel7beta/branched.
- Mirrors that are otherwise caught up can just use the batch file for that day. If they are more than a day out of sync it won't help them, but if they are in sync it will help them a lot: they can skip all the metadata fetching and the files that haven't changed, and just get the things that actually have.
- Interested parties can just run their sync to pull the day's batch file(s). If the files don't exist, then no sync has happened yet and they can do nothing. If they do exist, they can download the batch and run it.

Caveats:

- This won't help mirrors that carry subsets of things (i.e., if they exclude debuginfo or something) unless they tweak the batch files.
- This won't help anyone who doesn't opt in to using it.
- This won't help anyone who is more than a day out of date; they will need to sync up normally first.
- This may result in a "thundering herd" of syncs after the batch files appear. I suspect, however, that it may still be a lot less load than people doing full useless syncs.

Thoughts?

kevin
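P.S. To make the idea a bit more concrete, here is a rough sketch of what the two sides might look like. It only builds the rsync command lines and the mirror-side decision; it does not run anything. The only real rsync options used are --write-batch and --read-batch from the BATCH MODE section of the man page. Everything else is an assumption for illustration: the function names, the one-file-per-tree-per-day naming scheme, and the "at most one day behind" check are all hypothetical, and a real setup would likely need more options than plain -a.

```python
from datetime import date


def batch_name(tree: str, day: date) -> str:
    """Hypothetical naming scheme: one batch file per tree per day."""
    return f"{tree}-{day:%Y%m%d}.batch"


def write_batch_cmd(src: str, dest: str, batch_file: str) -> list[str]:
    """Master side: do the usual sync and also record it as a batch file."""
    return [
        "rsync", "-a",
        f"--write-batch={batch_file}",
        src.rstrip("/") + "/",
        dest.rstrip("/") + "/",
    ]


def read_batch_cmd(batch_file: str, dest: str) -> list[str]:
    """Mirror side: replay a downloaded batch file onto the local tree."""
    return ["rsync", "-a", f"--read-batch={batch_file}", dest.rstrip("/") + "/"]


def plan_sync(last_sync: date, today: date, batch_available: bool) -> str:
    """Decide what a mirror should do (assumed policy, not a spec).

    "full":  more than a day behind, so a normal sync is needed first.
    "wait":  caught up, but today's batch has not been published yet.
    "batch": caught up and the batch exists, so just apply it.
    """
    if (today - last_sync).days > 1:
        return "full"
    if not batch_available:
        return "wait"
    return "batch"


if __name__ == "__main__":
    today = date.today()
    name = batch_name("updates", today)
    # Paths below are placeholders, not real locations.
    print(" ".join(write_batch_cmd("/path/to/compose", "/path/to/master", name)))
    print(" ".join(read_batch_cmd(name, "/path/to/mirror")))
```

Note that the man page says a batch should be applied to a tree identical to the destination it was recorded against, which is why the "full" case has to come first: a mirror that is behind needs a normal sync before batches are of any use to it.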
_______________________________________________ infrastructure mailing list infrastructure@xxxxxxxxxxxxxxxxxxxxxxx https://admin.fedoraproject.org/mailman/listinfo/infrastructure