Re: Using bit torrent to retrieve RPMs for updates


 



On 26.02.2004 16:42, Jonathan Gardner wrote:
Has anyone given serious thought to changing Yum so that it uses the bittorrent protocol to retrieve RPMs? Especially in the case of updates, when everyone and their grandmother needs to get the RPMs right away, this would make a lot of sense. Yum could manage a repository of RPMs and constantly serve those up so others can download parts of them via bittorrent, all with permission, of course.

We have pondered this solution many times here, but there are several important drawbacks:


1. Bittorrent is highly inefficient for a large collection of small files. You would have to start a separate tracker item for each RPM, and for some of them the traffic generated just tracking the p2p clients would outweigh the savings of using bittorrent. I would imagine that several thousand tracker items would also be quite processor-intensive.
2. You have to specifically punch holes in the firewall for bittorrent: not one port, but a range of them. Most people will not do that, so they will be constantly leeching.
3. Yum runs as root, so you suddenly have a very large amount of code (yum+bittorrent libs) listening as root for incoming connections. Yikes. Alternatively, you'd have to fork a downloader process and communicate with it over some form of IPC. Either way is painful.


As you see, bittorrent is not very beneficial here. However, a bittorrent-like system used by *mirrors* could be of benefit. E.g. the client connects to the main server and says "I want foo-1.0-1.i386.rpm". The server then returns:

Checksum information for foo-1.0-1.i386.rpm:
bytes 0...10000: chksum1
bytes 10000...20000: chksum2
....
bytes n-10000...n: chksum n
The following servers claim to have it:
mirror.fooland.foo
mirror.barland.bar
....
mirror.bazland.baz
Go get it yourself.

The client then connects to the mirrors and fetches the ranges specified in the server response, thus creating a primitive swarm. The fetching can be done via http, ftp, and file as they all support fetching by byte range.
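
A minimal sketch of that exchange, in Python, might look like the following. Everything named here is an assumption for illustration: the function names, the use of SHA-256 as the checksum, and the `fetch` hook standing in for an http/ftp/file range request. It fetches chunks one after another rather than in parallel, spreads them over the mirrors round-robin, and falls back to the next mirror when a chunk fails its checksum.

```python
import hashlib
from typing import Callable, List, Tuple

CHUNK_SIZE = 10000  # chunk size borrowed from the example response above

# (start, end_exclusive, hex digest) for one chunk of the package
Chunk = Tuple[int, int, str]
# Hypothetical transport hook: fetch(mirror, start, end_exclusive) -> bytes
Fetcher = Callable[[str, int, int], bytes]


def build_manifest(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[Chunk]:
    """Server side: checksum each fixed-size chunk of the package."""
    manifest = []
    for start in range(0, len(data), chunk_size):
        end = min(start + chunk_size, len(data))
        # SHA-256 is an assumption; the post only says "chksum".
        digest = hashlib.sha256(data[start:end]).hexdigest()
        manifest.append((start, end, digest))
    return manifest


def fetch_package(manifest: List[Chunk], mirrors: List[str], fetch: Fetcher) -> bytes:
    """Client side: spread chunks over mirrors and verify each one."""
    parts = []
    for index, (start, end, digest) in enumerate(manifest):
        # Start at a different mirror for each chunk, then try the others.
        for attempt in range(len(mirrors)):
            mirror = mirrors[(index + attempt) % len(mirrors)]
            try:
                chunk = fetch(mirror, start, end)
            except OSError:
                continue  # mirror unreachable, try the next one
            if hashlib.sha256(chunk).hexdigest() == digest:
                parts.append(chunk)
                break
        else:
            # No mirror produced a chunk matching the checksum.
            raise IOError(f"no valid chunk for bytes {start}-{end}")
    return b"".join(parts)
```

With http, a `fetch` implementation would send a `Range: bytes=<start>-<end-1>` header, since HTTP byte ranges are inclusive at both ends. A mirror that returns garbage only costs one wasted request per chunk, because the bad chunk is rejected before it is appended.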

This would allow for auto-balancing the mirror load, though this solution is not without its own set of difficulties:

1. This still keeps thousands of trackers on the server, though with dedicated servers and limited tracker traffic compared to bittorrent, this would theoretically be easier to handle.
2. How to keep the list of mirrors current? Should they stay constantly connected to the main server a la bittorrent clients? Should they use some other bittorrent-like protocol for syncing with each other?
3. As tracker info for each package would be auto-generated, there's no way to sign it (this would require keeping the signing key on the server, which is a no-no). Attackers could potentially annoy a lot of people by publishing bogus mirror data pointing to odd places. This isn't really dangerous, though, as the final RPM assembled from bits and pieces fetched from various servers would still be cryptographically signed.


This could be a fun project to play with, if anyone likes to mess with things like that. :)

--
Konstantin ("Icon") Ryabitsev
Duke Physics Systems Admin, RHCE
I am looking for a job in Canada!
http://linux.duke.edu/~icon/cajob.ptml



