Re: caching data for thousands of nodes in a compute cluster

> Trying again, having got no response.  Any reaction to my questions?
>
> - Dave
>
> On Tue, Jun 12, 2007 at 11:42:42AM -0500, Dave Dykstra wrote:
>> On Tue, Jun 12, 2007 at 12:19:26AM +0200, Henrik Nordstrom wrote:
>> > On Mon 2007-06-11 at 15:17 -0500, Dave Dykstra wrote:
>> >
>> > > of jobs.  It quickly becomes impractical to distribute all the data from
>> > > just a few nodes running squid, so I am thinking about running squid on
>> > > every node, especially as the number of CPU cores per node increases.
>> > > The problem then is how to determine which peer to get data from.
>> >
>> > Multicast ICP sounds like it could be a reasonable option there.
>> >
>> > Regards
>> > Henrik
>>
>> I considered that, but wouldn't multicasted ICP queries tend to get many
>> hundreds of replies (on average, half the total number of squids)?  It
>> would only use the first response it got back, but it doesn't seem very
>> efficient of network or compute resources to throw away all the others.
>> Do you know of other people who have used multicast ICP for this type of
>> application?
>>
>> The multicast TTL could help a little but probably not much.  I expect
>> the servers are usually organized in smaller groups, with better network
>> connectivity within each group, but it isn't practical to ask the system
>> administrators to tell us which servers are in which group so everything
>> has to be automatic.  They're very likely all on the same large subnet
>> with the switches sorting out the routing, so it isn't clear that
>> anything at squid's level would be able to tell how far away servers are
>> other than by small differences in response time, or more likely
>> throughput of large transfers.  I also don't think we can really expect
>> to know the names of all the peers in order to list them in
>> "multicast-responder".
>>
>> - Dave
>
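
To put rough numbers on the reply-volume concern quoted above, here is a minimal back-of-the-envelope sketch. It is not taken from squid itself: the function name, the default URL length, and the reply fraction (set to the "half the squids" guess from the quoted message) are all assumptions for illustration. The only protocol fact used is that an ICP v2 reply carries a 20-byte header followed by the null-terminated URL.

```python
def icp_reply_load(num_squids, reply_fraction=0.5, url_len=60):
    """Estimate replies and discarded bytes for one multicast ICP query.

    reply_fraction is the assumed share of peers that answer (the quoted
    message guesses about half); url_len is a hypothetical URL length.
    """
    ICP_HEADER_BYTES = 20                    # fixed ICP v2 header size
    peers = num_squids - 1                   # the querying squid does not answer itself
    replies = int(peers * reply_fraction)    # replies arriving for this one query
    discarded = max(replies - 1, 0)          # only the first reply is actually used
    reply_bytes = ICP_HEADER_BYTES + url_len + 1   # header + URL + terminating NUL
    return replies, discarded, discarded * reply_bytes


# Example: 2001 squids, one query.
print(icp_reply_load(2001))  # (1000, 999, 80919)
```

Under these assumptions a single query among about 2000 squids draws on the order of a thousand replies, nearly all of which are thrown away, and that cost repeats for every cache miss on every node.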

Some of IPv6's neighbour-discovery features offer options in this area.
The drawbacks are:
  The network between the squids MUST be able to handle IPv6 traffic
properly, and with the current squid that means dual-stack Linux in some
form.
  It hasn't been written or even experimented with yet, AFAIK, so some
sponsorship would be needed for me or someone else to do it sooner than
a few years from now.

Amos
PS. yes folks squid3-ipv6 branch is in Beta testing now.
