Konstantin Ryabitsev wrote:
> Well, there is another issue which your method does not address. If you
> only torrent large files (say, over 50 Mb in size), then the vast
> majority of packages is still downloaded from the server. What bogs down
> large mirrors is not bandwidth -- we have it out the proverbial wazoo.
> What bogs us down is processor and io load. If most requests are still
> made to the servers (plus there is additional tracker load), does that
> actually improve the situation?

This is part of what we want to test. We are assuming that the BT
overhead is too high for smaller files because of the protocol's design
decisions, but we can't quantify it yet. A .torrent file will almost
always be smaller than the file that it describes. If we can reduce the
upload burden of a server from 1 copy of the file for every downloader
to the .torrent file plus maybe 5 copies of the file for every 100
peers, I think it will have a serious impact.

Like I've said, we are planning this public beta to flesh out these
issues and see what kind of impact we can have.

> I still think we're chasing the wrong goose trying to unite yum and
> bittorrent, and it is better to have a bittorrent-like protocol for
> syncing mirrors and a special download method for urlgrabber that would
> grab byte-ranges from multiple servers. This would allow the mirror load
> to drop significantly, thus improving the situation when half the world
> sends us HTTP requests.

I assume that most mirrors already use rsync, so I am not sure how
useful switching to BT would be, but it is something we will look into.

Another point to remember is that the server doesn't have to be the
tracker or the only initial seed. A tracker requires little bandwidth
and could be run by a smaller mirror, like my university (Lehigh
University). We don't have the bandwidth to run a full mirror, but we
could run a tracker. Then as many repositories as we can get can start
the initial seeding, since they already have the file.
When a new peer enters the swarm, it will contact all the repositories
and request pieces of the file, which would reduce the overall load on
the server dramatically. This scenario assumes a lot, but I think if the
system is viable, we will start to see things like this.

> But I would like to see and play with your work. There are some
> interesting points there.

Thanks. I will post here when we are ready to open the beta to the
public.

Also, you mentioned that you run a mirror. One thing that would help us
is to get our hands on a download log from a real mirror. This would
allow us to make a more educated guess about which files should be
served via BT instead of just using size as the deciding factor.

PS - There are several new features coming down the pike for BitTorrent
that could significantly impact this system.

(1) Multi-Tracker specification - This would allow us to specify more
    than one tracker for each torrent. If the first went down, the
    downloader would just switch to the next one.

(2) Peer Sharing - Two trackers running the same torrent can share
    peers. This will allow larger, longer-lasting swarms.

(3) Torrent File Sharing - Instead of a BT instance for every file being
    shared via BT, we start one up for all the files we want to share
    via BT. Then the downloader only requests pieces from the files that
    it wants. This should reduce the load on the server and maintain
    just one swarm. We are looking into this one, but we don't have
    anything yet.

Thanks,
Jarret Raim
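
For reference, the upload-burden estimate above (one full copy per
downloader versus a .torrent file per downloader plus maybe 5 full
copies per 100 peers) can be sketched in a few lines of Python. This is
only a back-of-the-envelope illustration: the function names, the 50 MiB
package size, and the 20 KiB .torrent size are hypothetical, and the
5-per-100 figure is the guess from the message, not a measurement.

```python
def http_upload_bytes(file_size, downloaders):
    """Plain mirror: the server uploads one full copy per downloader."""
    return file_size * downloaders


def bt_upload_bytes(file_size, torrent_size, downloaders,
                    seed_copies_per_100=5):
    """BT case: the server uploads a .torrent to every downloader,
    plus a small number of full copies to seed the swarm.

    seed_copies_per_100 is an assumed ratio (the "maybe 5 copies for
    every 100 peers" guess), not a measured value.
    """
    full_copies = downloaders * seed_copies_per_100 / 100
    return torrent_size * downloaders + file_size * full_copies


if __name__ == "__main__":
    file_size = 50 * 1024 * 1024   # hypothetical 50 MiB package
    torrent_size = 20 * 1024       # hypothetical 20 KiB .torrent file
    peers = 100

    http = http_upload_bytes(file_size, peers)
    bt = bt_upload_bytes(file_size, torrent_size, peers)
    print(f"HTTP upload: {http} bytes")
    print(f"BT upload:   {bt:.0f} bytes")
    print(f"BT / HTTP:   {bt / http:.3f}")
```

Note that this only counts bytes uploaded by the server. It says nothing
about the processor and io load raised above, which is the part the beta
is meant to measure.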