On Wed, Jul 31, 2013 at 1:44 AM, Alex Rousskov <rousskov@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
> On 07/30/2013 06:44 AM, Tim Murray wrote:
>
>> I'm running Squid 3.3.5 on 3 multicore systems here, using SMP and 6
>> workers per server, each dedicated to its own core. Each system is
>> running RHEL6 U4 with the 2.6.32 kernel.
>>
>> I'm noticing that as time goes on, some workers seem to be favoured
>> and end up doing the majority of the work. I've read the article
>> regarding SMP Scaling here:
>>
>> http://wiki.squid-cache.org/Features/SmpScale
>>
>> However, I'm finding our workers' CPU time differs quite substantially;
>
> As discussed on the above wiki page, this is expected. We see it all
> the time on many boxes, especially if Squid is not very loaded. IIRC,
> the patch working around that problem has not been submitted for
> official review yet -- no free cycles to finish polishing it at the
> moment.
>
>
>> I can also see the connection counts differ massively between the
>> workers:
>
> Same thing.
>
>
>> I'm a little concerned that the more people I migrate to this
>> solution, the more saturated the first 1 or 2 workers will become.
>> Do the workers happen to have some form of source or destination
>> persistence for (SSL?) connections, or something else that might be
>> causing this to occur?
>
> The wiki page provides the best explanation of the phenomena I know
> about. In short, some kernels (including their TCP stacks) are not
> very good at balancing this kind of server load.
>
>
>> And is there anything I can do to improve the distribution between
>> workers?
>
> I am not aware of any specific fix, except for the workaround patch
> mentioned on the wiki.
>
>
> Alex.

Thank you very much for that, Alex. To be honest, when I read the wiki
page I had assumed this patch had already been implemented.

In the meantime, I might see whether giving each worker its own
http_port and using the load balancer to even out the spread of traffic
will work.
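For reference, here is a minimal squid.conf sketch of that idea. It
assumes Squid 3.2 or later, where the ${process_number} macro is
available in the configuration file; the port numbers are purely
illustrative:

    workers 6

    # Give each worker its own listening port instead of one shared
    # http_port: worker 1 listens on 3131, worker 2 on 3132, and so
    # on up to worker 6 on 3136.
    http_port 313${process_number}

The load balancer would then be pointed at ports 3131-3136 and spread
new connections across them evenly, taking the balancing decision away
from the kernel's distribution of accepted connections across workers.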