Re: Instant Mirror Status...?

James Antill wrote:
>
Furthermore, I absolutely don't want to return the same mirror at the
top of the list _for everyone_ in a given country.
Hash MM's "primary" IP address to select one of the various available
mirrors, assuming they're returned in a consistent order?
If you are going to return a list of N mirrors, make N copies of that list, rotating one position for each. Knock the last octet off the source IP and hash the remaining part with some consistent algorithm that will give you N values and use that to choose the copy of the list you send.
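
As a rough illustration of the rotation idea above (not MirrorManager's actual code), the N "copies" never need to be stored: rotating a fixed-order list by an offset produces the same result on demand. The function name `rotated` and the sample mirror names are hypothetical.

```python
def rotated(mirrors, offset):
    """Return the mirror list rotated left by `offset` positions.

    Assumes `mirrors` is already in a fixed, consistent order.
    """
    if not mirrors:
        return []
    k = offset % len(mirrors)
    # Slicing builds the rotated view without keeping N separate copies around.
    return mirrors[k:] + mirrors[:k]


mirrors = ["mirror-a", "mirror-b", "mirror-c"]  # placeholder names
print(rotated(mirrors, 1))  # ['mirror-b', 'mirror-c', 'mirror-a']
```

Every client that maps to the same offset sees the same first mirror, while the full list is still returned for fallback.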

 Which is much harder than it sounds given that MM can't actually "make
N copies" of each list of IPs it might send out. But...

If you can get the list in a fixed order, you just have to replace the code that randomizes it with something that isn't 'worst-possible-case' for a site with a caching proxy. You could get some improvement simply by setting cache control headers on the list for some reasonable time - but then it is much harder to correct a mistake.

Everything is as distributed and robust as before, but you don't defeat attempts to save your bandwidth with caching proxies.

 This is _only_ true if you are getting asked for the list from every
single IP address, or that the subset of IP addresses you are getting
asked from happen to be as random/distributed as what MM does now.

That's up to the hashing algorithm. I'm not an expert, but someone should be able to pick one that takes the first 3 octets of an IP address as input and gives an essentially random distribution. For brute force you could convert the address to ASCII, MD5 it, then take the result modulo the number of list items as the starting point. There's probably something much more efficient, but that should give you randomness. I'd drop the last octet so that clustered proxies in the same class C subnet, or behind NAT gateways with multiple public addresses, would get the same list.
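
A minimal sketch of that brute-force approach, assuming IPv4 clients only. The function name `start_index` and the example addresses are made up for illustration, and MD5 is used here only for spreading values, not for security.

```python
import hashlib
import ipaddress


def start_index(client_ip, n_mirrors):
    """Map an IPv4 address to a starting position in a list of n_mirrors items."""
    if n_mirrors <= 0:
        raise ValueError("n_mirrors must be positive")
    # Validate the address, then drop the last octet so that every client
    # in the same /24 hashes to the same value.
    addr = ipaddress.IPv4Address(client_ip)
    prefix = ".".join(str(addr).split(".")[:3])
    # Hash the ASCII form of the prefix and reduce it modulo the list length.
    digest = hashlib.md5(prefix.encode("ascii")).digest()
    return int.from_bytes(digest, "big") % n_mirrors


# Two proxies in the same /24 get the same starting point.
print(start_index("192.0.2.17", 5) == start_index("192.0.2.200", 5))  # True
```

The returned index would then pick where in the fixed-order list the response starts. Whether the resulting distribution is even enough for real traffic is exactly the open question in this thread, and the sketch does not settle it.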

 You might argue that it'll probably be "random/distributed enough", but I
find it much easier to believe that the above will solve your problem
and you didn't get much further than that in your analysis.

It isn't 'my' problem. It's everyone's problem that the mirrors have to send many times the number of copies that they would if you stopped going out of your way to defeat existing caching infrastructure. And I intentionally left the choice of hashing algorithm up to someone who is more familiar with their nature. Personally, I don't think it can get any worse than it is, so I'm probably not qualified for the analysis you'd like. As long as you keep giving the whole list, the clients will find something that works even if it isn't optimal. Or maybe yum could look for proxy headers on the response and (optionally) randomize by itself if there are none.

--
  Les Mikesell
   lesmikesell@xxxxxxxxx



--
fedora-devel-list mailing list
fedora-devel-list@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/fedora-devel-list
