Bill Cox writes:

> As for why local mirrors aren't popular... I suspect it's because there
> are a LOT of sites out there that people download from.  You can't
> mirror them all locally, unless you're a huge ISP.  However, we could
> mirror them all with BT based servers.  I could probably serve RPMs to
> the entire FC4 community from my laptop over wireless.

Actually, I think that 99% of all people install from a (mirror) site
and end up running yum pretty much off of that site (or a mirror of
that (mirror) site) forever.  After all, that gets the person FC X,
complete, often plus some extensions, and FC is fairly functionally
complete as is, especially if dressed up with a few extensions
supported by a local sysadmin.

A relative FEW people (such as yourself), who obviously have mad
skills and a deep thirst for 7337 applications, peruse not only their
original install base, its updates, and local additions, but actively
seek out sites that build significant, often cool, extensions, or that
build tools or packages left out of the base install for political or
legal reasons.  I do some of this myself, as (for example) I still
have a ton of mp3's that I cannot re-rip and hence need xmms-mp3
whether or not Red Hat is comfortable distributing it (until somebody
figures out how to convert mp3->ogg without modulus noise and signal
degradation/distortion).

However, for those particular packages (and generally there are only a
few of them) there are already multiple strategies for the 7337 that
don't require bittorrent, e.g.:

  * turning ON access to their particular hosting repositories only
    when you need them (to install or search for packages);

  * locally mirroring only the particular part of those repositories
    required to satisfy the dependency tree of just those packages
    you've installed;

  * setting up yum to ignore everything in those particular
    repositories BUT the packages you are interested in and their
    minimal support tree (see the example repo stanza further down).

Doing a lot of these things "by hand" (custom crafting a solution for
just the few packages associated with a repository) is a good idea (if
not strictly necessary) anyway, and is very definitely a
for-experts-only sort of thing to do.  My own experience at least is
that some of the extended/advanced super-repositories have enough
hacks and patches and updated revision numbers that doing a SINGLE
UPDATE with them in the standard update path will often end up with
those sites "owning" your system, as they de facto update 1/3 of your
base packages, including many that you have no earthly interest in
seeing updated outside of the standard, reasonably trustworthy, FC
tree.  To recover from just this sort of thing I've had to do a full
reinstall just to take back control of a system where this happened,
and now I am very careful not to do a "yum update" with repositories
outside of my standard update set active.

In other words, playing the multi-repository game is a job for experts
in an expert-friendly field, not something that most users can manage.
In this sense, bittorrent (enabled/implemented as you describe) will
make things >>worse<< by encouraging people who LACK the skills to try
to do exactly this sort of thing without any thought for or knowledge
of the consequences.  Updating from a list of 30 repositories isn't
really all that wise or desirable -- it's sort of "shades of rpmfind
days", where nearly anybody could fubar their system by grabbing rpm's
built on top of nearly any base and hey, maybe install them (maybe
with --force) and maybe they'd work.
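For what it's worth, plain yum can already express the first and third
of those strategies without any new transport.  A minimal sketch -- the
repository name, URL, and package list below are invented purely for
illustration, and includepkgs/--enablerepo support should be checked
against the docs for whatever yum version you're running:

  [thirdparty]
  name=Some third-party extension repo
  baseurl=http://thirdparty.example.org/fedora/$releasever/$basearch/
  # disabled by default, so a routine "yum update" never touches it
  enabled=0
  gpgcheck=1
  # only the package(s) you actually want from this repo are visible
  includepkgs=xmms-mp3

and then turn the repo on only when you actually need it:

  yum --enablerepo=thirdparty install xmms-mp3

That doesn't make multi-repository life safe, but it at least keeps a
single careless "yum update" from handing your system over to a
repository you only wanted one package from.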
Yum is brilliant and designed to TRY to keep you out of trouble, but
it cannot work miracles, and EACH complete repository you might
include is usually built/layered for INTERNAL consistency -- sometimes
(even often) at the expense of global "FC X" consistency.  Every local
"fix" or patch or addition that alters an rpm shared by a large chunk
of the base increases divergence from the base and the base update
stream.  This is the basic problem with all package distribution
schemes and the reason testing and validation are actually very
important, at least to people who love stability and hate pain.

Finally, there are the security issues to think of.  When you say
"fairly secure" below it makes me very nervous.  Distributed
distribution protocols (to me) imply a tremendous amount of
distributed trust and "cannot" in general be made truly secure.  That
is, they probably can be, but the cost of true security is the
implementation of central network authorities that sell you security
for out-of-pocket money, as e.g. the toplevel SSL CA's do now for even
the modest degree of security provided by at least knowing that the
host you are contacting really is the host that you think it is and
not some hacker kid down the hall with a laptop with a fast network
interface.  The internet punishes trust and rewards certainty.

As things are now, you must trust the administrators of whatever
site(s) you install from.  The fewer administrators there are in the
mirror chain back to the primary distribution sites, the less of a
chance that the rpm's you install will be trojanned.  gpg signatures,
md5 checksums, and the like are all simply lovely ways of verifying
file validity, but they rely on having the correct original signature
keys, a correct list of the checksums, and most of all, on users who
don't turn them off the first time they try to install something,
discover they don't have the keys installed, and don't know how to get
and install them.

MY vote for yum's next serious extension is an SSL-verified key
retrieval and installation tool, since without SSL (or some other
toplevel certificate authority) in the chain one cannot even (really)
"trust" the keys one downloads and installs, assuming one DOES know
how to find, download, and install the right keys for some
distribution or repository.  Figuring out spoofs, redirects,
man-in-the-middle attacks, and so on to circumvent just pointing a
browser at what you think is the right URL is left as an exercise for
the studio audience, as is the rather scary estimate of the number of
sites/users that currently just turn off gpg checking the first time
they want a signed rpm from a site whose new/different keys they don't
know how to install.

Bittorrent will just make all of this worse, in every way, I think, if
only by ENCOURAGING users to start "shopping" dozens of repositories
instead of just one or two that are probably sufficiently "local" that
the network in between is approximately trustworthy.  I could be
wrong, of course, but I don't think so.

> As for technical issues in BT, they can all be addressed, but a new
> protocol will have to be implemented.  It can be similar to BT (even a
> strict super-set).  However, it will be a new protocol.  For me, that's
> the fun part.

And now to the constructive part of my comment.  There are always many
ways to solve any problem.
One way to solve THIS problem (implementing BT in a "urlgrabber"-like
link in the yum installation/retrieval chain) is, as suggested, to
build a complex extension of BT that can be directly integrated into
yum and, with a fair bit of tweaking and patching and addressing of
security concerns, eventually turn the entire FC-X installed base into
some sort of massive rpmfind entity.  (Brrrr.  Icy fingers just played
up and down my spine:-)

OR, you can just write a BT client to automagically build and update a
local mirror using data in /etc/yum.repos.d/ and yum's idea of what is
currently installed.  Whoa, that sounds like it would actually be
pretty EASY.  You could probably even use yum, or components of yum,
to help -- you'd still likely want/need some sort of extended
dependency resolution and the ability to select just PARTS of
e.g. dag, so as to not de facto overwrite the entire master FC X
base+updates with FC X+dag base+updates (often different and more
advanced in release number, but tested outside of the master FC X
test/validation process in ways you may or may not choose to trust --
not picking on dag, whose repository I after all use to grab certain
things myself:-).  You'd also want to arrange it so that "yum list" or
"yum info" refers back to the original sites (to get data on stuff
that isn't locally mirrored), but a "yum install" precipitates
automagical mirroring of the installed package and its dependencies.
Easy and useful.

As a number of people have suggested, local mirrors completely solve
the bandwidth problem and the problem of making yum "immediately"
usable with access to complete repositories.  With 160 GB disks
available for $70 (with rebate) at Circuit City, it is clear that
nearly anybody 7337 can "afford" to keep a local update mirror on
anything bigger than an old laptop (and everybody else will continue
to just use their original base anyway).

So you set up a BT client that does something like:

  a) Grab and mirror the repo(s) listed as primary mirrorlist entries,
     and install it (them) on the LOCAL path(s) of the primary repo(s)
     on a given system.  Or something like this, the point being to
     enable something like:

       [base]
       name=Fedora Core $releasever - $basearch - Base
       baseurl=file:///yum_repos/dulug/fc$releasever/$basearch
       mirrorlist=http://fedora.redhat.com/download/mirrors/fedora-core-$releasever
       enabled=1
       gpgcheck=1

     as pure automagic, creating the baseurl if it doesn't exist and
     putting a copy of fedora.redhat.... into it.

  b) Search down the dependency trees of selected extension repos/sites
     and mirror them TO THE EXTENT that you have entities installed
     from those sites (only).  Here you might even need to extend the
     functionality of yum, or use some of its more rarely used
     features, to e.g. prevent your system from mirroring ALL of dag
     just because you have xmms-mp3 installed, or from updating from
     dag (and replacing 1/3 or more of your operational system from
     it).  Here the building of a robust solution will be tricky, but
     not because of BT.

Of course, for this type of thing rsync still seems like it would be
an easier tool to use as a base, and is what I and most others use in
our custom scripts that accomplish pretty much this same thing.  It
leaves you with complete control over the resources devoted to
maintaining mirrors or updating.
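For reference, the rsync version of step a) is basically a one-line
cron job, something like the following (the mirror hostname and module
path are hypothetical -- substitute a real rsync-capable Fedora mirror
and a local path that matches your baseurl):

  # nightly cron job: mirror the FC4 i386 base tree into the local
  # baseurl path used by the [base] stanza above
  rsync -av --delete \
      rsync://mirror.example.edu/fedora/core/4/i386/os/ \
      /yum_repos/dulug/fc4/i386/

Step b) is the genuinely hard part whichever transport you pick, since
it requires dependency-aware selection rather than a bulk copy.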
Again, in my own direct experience, bittorrent-like solutions can
really suck every bit of bandwidth out of a DSL link, as your system
is being used to provide chunks of this and that to complete strangers
in exchange for the dubious benefit of being able (far more rarely) to
get chunks of this and that from them.

Anyway, you get the idea.  The basic principle here is that BT or
rsync or ftp -- grabbing truly remote rpm's in an update in real time
-- sucks, especially over a DSL link.  It's ok for an occasional
install of a small package -- otherwise it just burns lifetime.  It is
therefore desirable to trade some of your cheap, readily accessible
disk for time by pre-downloading and mirroring the RELEVANT part of
the repositories you have in your "permanent" repo list -- all of the
base, plus the minimal dependency chains for package-specific
additions.  Finally, you need some really smart logic to at least try
to keep naive users from screwing themselves by layering enough
repositories on top of each other that -- "FC X" or not -- they
eventually become de facto incompatible and break the shit out of
everything.

> BTW, I don't have a good name for the utility, assuming I do work on it.
> Is BTFS (for BitTorrent File System) any good?  I was also hoping to
> build a fuse interface to the utility that would allow you to mount the
> served directory structure as a local disk.  Is BT-FTP better?

"BTFS" sounds like it is a project in and of itself, if you really
plan to create an actual filesystem (that one mounts and everything).
Again, the basic concept itself sounds like a sysadmin's worst
nightmare from a security and resource control point of view -- lots
of strangers putting lots of files over which you have no control on a
locally mounted "virtual" filesystem, lots of strangers using your
resources and bandwidth to provide and retrieve those files to you and
from you.  This is serious business, and before you start you should
investigate carefully the expected scaling of the solution, the other
applications that might use it, and how in the world you are going to
secure it and keep me (as root on one of the participating sites) from
slipping trojans into the distribution chain for any of those
applications.

This isn't like distributing data-only files -- music or video.  There
the risk is that a binary that might be used to read them has a buffer
overflow vulnerability that can be exploited by carefully crafted
data, which HAPPENS often enough but isn't horribly LIKELY for all of
that.  You're distributing the core libraries and binaries from which
a system is built.  Simply inserting a SINGLE RPM into the chain that
is "guaranteed" to be an update on all downstream hosts would
compromise basically everybody in the universe that had gpgcheck=0.
Presuming that your tool is a SUCCESS and that pretty much all the
independent FC X build sites start to participate (at least tens of
them), each with their own personal gpg signatures or no signature at
all, the inclination for sites that use your tool to skip the
signature check as default behavior will be overwhelming -- unless, as
noted above, somebody FIRST fixes yum so that it can securely retrieve
keys, which may require some sort of "registration" of trusted sites
and their SSL identifiers, or some of the usual nightmarish machinery
for extending trust across a fundamentally insecure network.  This is
essential (IMHO) to make yum "safe" for individual non-sysadmin-type
users who don't really know what a gpg key IS, let alone how to
retrieve it (securely!) and install it.
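To be concrete about what such a tool would have to automate: about
the best a careful user can do today is something like the following,
by hand, and even that only helps if the repository publishes its
signing key at an https URL with a certificate your tools actually
verify (the URL and key name below are invented for illustration):

  # fetch the repository's signing key over SSL, so that at least the
  # CA chain vouches for the host you retrieved it from
  wget https://thirdparty.example.org/RPM-GPG-KEY-thirdparty

  # import it into rpm's keyring so that gpgcheck=1 can stay turned on
  rpm --import RPM-GPG-KEY-thirdparty

Deciding whether the key's OWNER deserves your trust is still a human
problem, of course, which is exactly why handing non-experts a list of
thirty seed sites scares me.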
In the meantime, using multiple repositories and shopping far and wide
for exotic packagings of this and that is at your own risk; YMMV;
don't blame yum if you break your system and have to reinstall or live
with bizarre bugs.

   rgb