A Better Solution to the RHN Bandwidth Problem

Here's an idea I've been kicking around in my head for the last few days.
It's pretty long, but hopefully it's worth the read.

Summary:
RedHat (and other Linux companies that want to provide priority access to
open-source software for paying customers) needs an ISO distribution mechanism
with the scalability of BitTorrent and the access controls of a system
like RedHat Network.  Since BitTorrent and RedHat's releases are both
open-source software, we can modify BT and combine the result with RHN to
get a system that meets both of these needs.


Problem:
The idea of this week's early access program was to increase the value of
subscribing to RedHat Network.  However, since access was restricted to only
paying RHN subscribers, RedHat could not use its existing network of public
ftp mirrors, and rhn.redhat.com couldn't provide enough bandwidth for us all
at once.  Meanwhile, the RH9 ISOs were available to everyone through
BitTorrent's peer-to-peer download system, and it was darn fast.  For example,
Monday morning I was getting disc 1 from RHN at 3.5kB/s, with an estimated
download time of over 50 hours just for the first disc.  Monday evening, I got
all 3 ISOs from BitTorrent in around 8 or 9 hours, at up to
80kB/s.  The net result was that paying for RHN's priority download got you a
much slower and less reliable service than what was available elsewhere for
free.  What's worse, those of us who *are* RHN subscribers and used BitTorrent
helped to widen the disparity by uploading to non-subscribers.

Lots of people complained, but no one has proposed a solution to the
problem, other than for RedHat to buy more bandwidth--which they did (thanks
guys!).  So starting Tuesday morning speeds have been better.  But bandwidth
isn't cheap, and I think we'd all rather see the money go towards R&D, new
features, security fixes, or even "Profit!"


Solution: "Embrace and extend" BitTorrent.

To begin, suppose RedHat started using standard BitTorrent to distribute their
normal releases.  (* I think they should, with one reservation.)  Let's think
about how such a system would work:
  1.  A user with BT installed on his machine opens http://www.redhat.com
      in his browser.
  2.  The user clicks on a link to a .torrent file, say "redhat9.torrent"
  3.  BitTorrent pops up and the user picks where to save his files
  4.  The user's BitTorrent client connects to the .torrent file's Tracker
  5.  The Tracker directs him to other users
  6.  He actually downloads his ISO's from the other users

(BitTorrent experts, please correct me if I'm getting part of this wrong.
I used BT for the first time on Monday, so I'm still learning.)

To handle more clients at once, the web server could distribute clients among
a set of Trackers, and RedHat and its trusted mirror sites could run a set of
upload-only clients who already have the full distribution.  How to best do
this is an interesting problem in itself, but it's not particularly relevant
here.

Now suppose we want to restrict access to only paying RHN subscribers.  There
are three major changes we need to make:
  A.  In step 1, the user now goes to https://rhn.redhat.com and logs in with
      his username and password before being granted access to the .torrent.
  B.  In step 4, the tracker now requires proof that the user is a real RHN
      subscriber before it will direct him to peers for downloading.
  C.  In step 6, the peers now require proof that the user is a real RHN
      subscriber before they will allow him to download.

I'm no expert, but that looks a whole heckuva lot like Kerberos to me.  RHN
plays the role of Authentication Server, the Tracker is analogous to a Ticket
Granting Server, and the peers are like protected resources (printers,
filesystems, etc).  While real Kerberos deployments can be a lot of work, the
underlying protocol is conceptually simple, and problems with earlier versions
have been discovered and fixed.  Developing a very simple Kerberos-like
system such as this shouldn't be too hard.

Now the process looks like this:
  1.  The user goes to https://rhn.redhat.com and logs in with his username
      and password.
  2.  He locates and clicks on the .torrent, just like before, but now RHN
      sends him a ticket to allow his IP address to access the Tracker, along
      with (or embedded in) the .torrent file.
  3.  BT client starts up, just like before.
  4.  The BT client connects to the tracker and presents its ticket.
  5.  The Tracker verifies the ticket, then sends the new client a list of
      peers, along with tickets for each of them.
  6.  The user's BT client connects to the peers, using its new tickets.
  7.  Each peer verifies the ticket, then starts the actual data transfer.
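
To make the ticket chain concrete, here is a minimal sketch in Python.  It is
not real Kerberos and not anything RHN or BitTorrent actually implements: the
ticket format, the function names, and the keys are all made up for
illustration.  It assumes RHN shares one secret key with the Tracker and the
Tracker shares one secret key with each peer, and it uses an HMAC in place of
Kerberos's encrypted tickets and session keys.

```python
import hashlib
import hmac
import time
from typing import Optional


def issue_ticket(key: bytes, client_ip: str, service: str,
                 lifetime_s: int = 3600, now: Optional[float] = None) -> str:
    """Build 'ip|service|expiry|mac', authenticated with the key shared with `service`."""
    issued = time.time() if now is None else now
    payload = f"{client_ip}|{service}|{int(issued) + lifetime_s}"
    mac = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{mac}"


def verify_ticket(key: bytes, ticket: str, client_ip: str, service: str,
                  now: Optional[float] = None) -> bool:
    """Accept only an unexpired, untampered ticket for this IP and this service."""
    current = time.time() if now is None else now
    parts = ticket.split("|")
    if len(parts) != 4:
        return False
    ip, svc, expiry, mac = parts
    expected = hmac.new(key, f"{ip}|{svc}|{expiry}".encode(),
                        hashlib.sha256).hexdigest()
    # The MAC is checked first, so `expiry` is only parsed once it is known
    # to have been written by the issuer.
    return (hmac.compare_digest(mac.encode(), expected.encode())
            and ip == client_ip and svc == service
            and current < int(expiry))


# Hypothetical keys and addresses, for demonstration only.
RHN_TRACKER_KEY = b"demo-key-rhn-tracker"
PEER_KEYS = {"peer-a": b"demo-key-peer-a", "peer-b": b"demo-key-peer-b"}
CLIENT_IP = "203.0.113.7"

# Step 2: RHN hands the logged-in user a ticket for the Tracker.
tracker_ticket = issue_ticket(RHN_TRACKER_KEY, CLIENT_IP, "tracker")

# Step 5: the Tracker verifies it, then issues one ticket per peer.
peer_tickets = {}
if verify_ticket(RHN_TRACKER_KEY, tracker_ticket, CLIENT_IP, "tracker"):
    peer_tickets = {name: issue_ticket(key, CLIENT_IP, name)
                    for name, key in PEER_KEYS.items()}

# Step 7: a peer verifies its ticket before sending any data.
peer_a_ok = verify_ticket(PEER_KEYS["peer-a"], peer_tickets["peer-a"],
                          CLIENT_IP, "peer-a")
print(peer_a_ok)
```

Binding each ticket to an IP address, a service name, and an expiry time means
a ticket copied to another machine, replayed against a different peer, or
kept until next week is rejected.  A real design would still need a way to
hand keys to peers securely, which this sketch skips.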

Redhat.com's tasks have gone from:
  - authenticate users
  - transfer 2-4GB to each impatient subscriber
to:
  - authenticate users
  - transfer between a few kB and a few MB of checksums to each user
  - run Trackers and a few of the many upload-only clients
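
As a rough check on the size of those checksums: a standard .torrent file
carries one 20-byte SHA-1 digest per piece, so the amount of hash data depends
only on the total size and the piece size.  The sketch below assumes a
three-disc set of 650 MiB ISOs and two example piece sizes; both are my
assumptions, not figures from RedHat.

```python
SHA1_DIGEST_BYTES = 20  # one SHA-1 digest is stored per piece in a .torrent


def piece_hash_bytes(total_bytes: int, piece_bytes: int) -> int:
    """Bytes of piece hashes needed to cover `total_bytes` of content."""
    pieces = -(-total_bytes // piece_bytes)  # ceiling division
    return pieces * SHA1_DIGEST_BYTES


# Assumed: three 650 MiB discs.
iso_set_bytes = 3 * 650 * 1024 * 1024

for piece_kib in (256, 1024):  # assumed piece sizes, in KiB
    size = piece_hash_bytes(iso_set_bytes, piece_kib * 1024)
    print(f"{piece_kib} KiB pieces: {size} bytes of hashes")
```

With 256 KiB pieces that comes to 156,000 bytes for the whole set, which is
tiny next to the 2-4GB that would otherwise go to each subscriber.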

I think such a system could work.  It would drastically reduce the demand
on redhat.com (and thus the cost to RedHat Software) and improve download
speeds for users.  It also makes early access to new releases more exclusive,
which was the whole point of making RH9 available to us this week.  Some
subscribers will still choose to redistribute, and that's OK; everyone else
gets access to the ISO's next week through the official channels anyhow.
At least redistributing early under the new system requires some level of
effort; this week it's been automatic with BitTorrent.


Thanks to everyone for reading through all of this.  I've thought of a few
other details regarding an actual implementation, but they're not really
relevant at this time.

Suggestions? comments? flames?  I'd especially like to hear what the folks
@redhat.com think.

-charles



* My one reservation has to do with verifying the integrity of software
downloaded with BitTorrent.  md5sums should either come straight from the
distributor (RedHat) over an SSL channel with server authentication, or they
could be distributed with the ISO's if they're signed by the distributor and
some mechanism is in place for verifying that signature.  To make life easier
for the users, RH could even sign smaller chunks of data (say, every 16MB) so
that one or two malicious peers can't force you to download the whole
ISO again.  (Again, I'm new to BT, so it might deal with malicious users
already.  I have not read anything that explicitly says it does.)
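
Here is a minimal sketch of the per-chunk idea, under a few assumptions.  It
uses SHA-256 rather than md5sums, it takes the list of trusted digests as
already authenticated (fetched over SSL or signature-checked by some means
not shown here), and the function names are made up for illustration.  The
point is only that a bad chunk can be located and re-fetched on its own.

```python
import hashlib
import io
from typing import BinaryIO, List

CHUNK_BYTES = 16 * 1024 * 1024  # the 16MB chunk size suggested above


def chunk_digests(stream: BinaryIO, chunk_bytes: int = CHUNK_BYTES) -> List[str]:
    """Return one SHA-256 hex digest per fixed-size chunk of the stream."""
    digests = []
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:
            break
        digests.append(hashlib.sha256(chunk).hexdigest())
    return digests


def bad_chunks(stream: BinaryIO, trusted: List[str],
               chunk_bytes: int = CHUNK_BYTES) -> List[int]:
    """Return indexes of chunks that are missing, extra, or do not match."""
    actual = chunk_digests(stream, chunk_bytes)
    count = max(len(actual), len(trusted))
    return [i for i in range(count)
            if i >= len(actual) or i >= len(trusted) or actual[i] != trusted[i]]


# Tiny in-memory demonstration with 10-byte chunks instead of 16MB ones.
original = b"a" * 10 + b"b" * 10 + b"c" * 5
trusted_list = chunk_digests(io.BytesIO(original), chunk_bytes=10)

corrupted = b"a" * 10 + b"b" * 9 + b"X" + b"c" * 5  # one byte altered
damaged = bad_chunks(io.BytesIO(corrupted), trusted_list, chunk_bytes=10)
print(damaged)
```

Only the chunks listed by bad_chunks would need to be downloaded again,
instead of the whole ISO.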





