Marcus, thanks for your info.
OS is CentOS 6 with kernel 2.6.32-504.30.3.el6.x86_64.
Let me try to explain:
On non-SMP with traffic at ~300 Mbit/s we have a load of ~4 (on 6 workers).
In that case actual user time is about 10-20%, 70-80% is sys time (osq_lock), and there are no connection timeouts.
If I switch to SMP with 6 workers, user time goes up but sys time goes up too, there are connection timeouts, and the load jumps to ~12.
If I give it more workers, only the load jumps and more connections get dropped, to the point that the load goes to 23/24 and the entire server is slow as hell.
So, the best performance so far is with 6 non-SMP workers.
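For reference, the per-core user/sys split can be watched with mpstat from the sysstat package; a sketch (interval and count are arbitrary):

# per-CPU utilisation, 1-second samples, 5 iterations
mpstat -P ALL 1 5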
For now I have 2 options:
1. Install an older Squid (3.1.10 from the CentOS repo) and try that.
2. Build a custom 64-bit kernel with RCU and specific CPU family support (in progress).
The end goal is to be able to sustain 1 Gbit/s of traffic on this server :)
Any advice is welcome.
J.
2015-07-31 14:53 GMT+02:00 Marcus Kool <marcus.kool@xxxxxxxxxxxxxxx>:
osq_lock is used in the kernel for the implementation of a mutex.
It is not clear which mutex, so we can only guess.
Which version of the kernel and distro do you use?
Since mutexes are used by Squid SMP, I suggest switching to Squid non-SMP for now.
What is the value of cpu_affinity_map in all config files?
You say they are static, but do you allocate each instance on a different core?
Does 'top' show that all CPUs are used?
Do you have 24 cores or 12 hyperthreaded cores?
In case you have 12 real cores, you might want to experiment with 12 instances of Squid and then try to upscale.
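For example - a sketch only, the file names and core numbers are illustrative - each non-SMP instance can pin its single worker process to its own core and run from its own config file:

# in /etc/squid/squid1.conf: pin the worker to core 1
cpu_affinity_map process_numbers=1 cores=1
# in /etc/squid/squid2.conf: pin the worker to core 2
cpu_affinity_map process_numbers=1 cores=2

# start each instance with its own config file
squid -f /etc/squid/squid1.conf
squid -f /etc/squid/squid2.conf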
Make maximum_object_size large; a max size of 16 KB will prohibit the retrieval of objects larger than 16 KB.
I am not sure about 'maximum_object_size_in_memory 16 KB', but let it be infinite and do not worry, since cache_mem is zero.
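Concretely, something like this (the size here is arbitrary, anything large works):

# lift the 16 KB ceiling so large objects are not affected
maximum_object_size 512 MB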
Marcus
On 07/31/2015 03:52 AM, Josip Makarevic wrote:
2015-07-31 0:42 GMT+02:00 Amos Jeffries <squid3@xxxxxxxxxxxxx>:

Hi Amos,
cache_mem 0
cache deny all
Both are already there.
Regarding the number of NIC ports: we have 4x 10G Ethernet cards, 2 in each bonding interface.
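In case it is useful, the bonding state can be verified through procfs (assuming the interface is named bond0):

cat /proc/net/bonding/bond0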
Well, the entire config would be way too long, but here is the static part:
via off
cpu_affinity_map process_numbers=1 cores=2
forwarded_for delete
visible_hostname squid1
pid_filename /var/run/squid1.pid
icp_port 0
htcp_port 0
icp_access deny all
htcp_access deny all
snmp_port 0
snmp_access deny all
dns_nameservers x.x.x.x
cache_mem 0
cache deny all
pipeline_prefetch on
memory_pools off
maximum_object_size 16 KB
maximum_object_size_in_memory 16 KB
ipcache_size 0
cache_store_log none
half_closed_clients off
include /etc/squid/rules
access_log /var/log/squid/squid1-access.log
cache_log /var/log/squid/squid1-cache.log
coredump_dir /var/spool/squid/squid1
refresh_pattern ^ftp: 1440 20% 10080
refresh_pattern ^gopher: 1440 0% 1440
refresh_pattern -i (/cgi-bin/|\?) 0 0% 0
refresh_pattern . 0 20% 4320
acl port0 myport 30000
http_access allow testhost
tcp_outgoing_address x.x.x.x port0
The include is there for the basic ACLs - safe ports and so on - to minimize the config file footprint, since it's static and the same for every worker.
That per-port stanza repeats 44 more times in this config file (a sketch of the pattern follows).
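For illustration, every repeated stanza follows the same pattern; the port and outgoing address below are hypothetical:

acl port1 myport 30001
http_access allow testhost
tcp_outgoing_address y.y.y.y port1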
Do you know of any good article on how to tune kernel locking, or have any idea why it is happening?
I cannot find any good info on it, and all I've found are bits and pieces of kernel source code.
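One way to narrow it down - a sketch, any perf build with call-graph support should do - is to record call chains system-wide and look at which code paths end up spinning in osq_lock:

# sample all CPUs with call graphs for ~10 seconds
perf record -a -g -- sleep 10
# then inspect the callers of osq_lock
perf report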
Thanks.
J.
On 31/07/2015 8:05 a.m., Josip Makarevic wrote:
> Hi,
>
> I have a problem with a Squid setup (Squid version 3.5.6, built from source,
> CentOS 6.6).
> I've tried 2 options:
> 1. SMP
> 2. NON-SMP
>
> I've decided to stick with the custom-built non-SMP version, and the thing is:
> - I don't need a cache - any kind of it
cache_mem 0
cache deny all
That is it. All other caches used by Squid *are* mandatory for good
performance, and they are only used when the component that needs them
is actively used.
> - I have a DNS cache just for that
> - Squid has to listen on 1024 ports across 23 instances.
> each instance listens on a set of ports, and each port has a different outgoing
> IP address.
And how many NICs do you have that spread over?
>
> The thing is this:
> It's all good until we hit it with more than 150 Mbit/s, then...
>
> (output from perf top)
> 84.57% [kernel] [k] osq_lock
> 4.62% [kernel] [k] mutex_spin_on_owner
> 1.41% [kernel] [k] memcpy
> 0.79% [kernel] [k] inet_dump_ifaddr
> 0.62% [kernel] [k] memset
>
> 21:53:39 up 7 days, 10:38, 1 user, load average: 24.01, 23.84, 23.33
> (yes, we have 24 cores)
> The behavior is the same with the SMP and non-SMP setups (the SMP setup is all in one file
> with the 'workers 23' option, but then I have to use the rock cache).
>
> So, my question is: how do I optimize this? I've been stuck
> for days; I've tried many sysctl options but none of them work.
> Any help, info, something else?
None of those are Squid functionality. If you want help optimizing your
config and are willing to post it to the list I am happy to do a quick
audit and point out any problem areas for you.
But tuning the internal locking code of the kernel is way off topic.
Amos
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users