On 25/01/19 1:24 am, Marc wrote:
> Hi,
>
> For some reason my squid sometimes hangs (after weeks of running
> smoothly) in 100% steal, until I kill the process and restart it, after
> which the process will again run stable for weeks.

What does "100% steal" mean?

>
> It's running on an AWS EC2 instance, squid version:
> squid-3.5.20-10.34.amzn1.x86_64 , see below for some debugging info.
> Any idea what could be the problem here ? Thanks!
>
> top:
> [11:56:49][root@ip-172-31-9-138 ~]# top
> top - 11:57:11 up 218 days, 17:36, 1 user, load average: 1.06, 1.17, 1.09
> Tasks: 81 total, 2 running, 79 sleeping, 0 stopped, 0 zombie
> Cpu(s): 4.5%us, 0.3%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 95.2%st
> Mem: 501220k total, 405748k used, 95472k free, 65512k buffers
> Swap: 0k total, 0k used, 0k free, 88948k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
> 29963 squid     20   0  290m 171m 7472 R 99.9 35.1 672:59.73 squid
>     1 root      20   0 19648 2480 2148 S  0.0  0.5   0:02.05 init
> <snip>
>
> vmstat:
> [11:57:39][root@ip-172-31-9-138 ~]# vmstat 1
> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>  1  1      0  95408  65536  89052    0    0     0     4    1    1  0  0 99  0  0
>  1  0      0  95408  65536  89040    0    0     0     4   56   36  5  0  0  0 95
>  2  0      0  95408  65536  89040    0    0     0     0   54   18  5  0  0  0 95
>  1  0      0  95408  65536  89040    0    0     0     0   57   30  5  0  0  0 95
>  1  0      0  95408  65536  89040    0    0     0     4   52   25  5  0  0  0 95
>  3  0      0  95408  65536  89040    0    0     0     0   52   14  6  0  0  0 94
>  1  0      0  95408  65536  89040    0    0     0     0   50   26  4  0  0  0 96
>  2  0      0  95408  65536  89040    0    0     0     0   53   21  6  0  0  0 94
>  1  0      0  95408  65540  89036    0    0     0    12   62   38  5  0  0  0 95
>  2  0      0  95408  65540  89040    0    0     0    36   55   14  5  0  0  0 95
>  1  0      0  95408  65540  89040    0    0     0     0   51   34  5  0  0  0 95
>
> gdb:
> [11:55:07][root@ip-172-31-9-138 ~]# sudo gdb -n -batch -ex backtrace -pid 29963
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> 0x00000000007bca52 in CbcPointer<Comm::TcpAcceptor>::operator=(CbcPointer<Comm::TcpAcceptor> const&) ()
> #0 0x00000000007bca52 in CbcPointer<Comm::TcpAcceptor>::operator=(CbcPointer<Comm::TcpAcceptor> const&) ()
> #1 0x00000000007bc3d4 in Comm::AcceptLimiter::kick() ()
> #2 0x0000000000721867 in AsyncCall::make() ()
> #3 0x00000000007259e2 in AsyncCallQueue::fireNext() ()
> #4 0x0000000000725e20 in AsyncCallQueue::fire() ()
> #5 0x00000000005b0089 in EventLoop::runOnce() ()
> #6 0x00000000005b0178 in EventLoop::run() ()
> #7 0x00000000006192cc in SquidMain(int, char**) ()
> #8 0x0000000000514b3b in main ()
>

This looks like it may be one of the symptoms of
<https://bugs.squid-cache.org/show_bug.cgi?id=4885> which was fixed in the
Squid-4.3 release. Please try the current Squid-4 release to see if the
issue is already resolved.

v3.5 is no longer supported, so if it is a bug we will need traces and
replication using the current Squid (v4 or v5) version to have a
realistic chance of anyone being able to fix it.

Amos
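
For reference on the "st" column in the quoted top and vmstat output: on Linux it reports steal time, that is, time during which a virtual CPU had runnable work but the hypervisor was running something else. The figure comes from the eighth counter on the aggregate "cpu" line of /proc/stat. Below is a minimal Python sketch of that calculation, assuming the usual /proc/stat field order (user, nice, system, idle, iowait, irq, softirq, steal). The function names and the sample counter values are illustrative only and do not come from the original report.

```python
def parse_cpu_line(line):
    """Return the integer counters from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    counters = [int(value) for value in fields[1:]]
    if len(counters) < 8:
        raise ValueError("kernel does not report a steal counter")
    return counters


def steal_percent(before, after):
    """Percentage of CPU time stolen between two samples of the 'cpu' line."""
    a = parse_cpu_line(before)
    b = parse_cpu_line(after)
    # Sum only user..steal (first 8 counters); guest time is already
    # accounted for inside the user and nice counters.
    deltas = [y - x for x, y in zip(a[:8], b[:8])]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def read_cpu_line(path="/proc/stat"):
    """Read the aggregate 'cpu' line from a live Linux system."""
    with open(path) as stat_file:
        return stat_file.readline()


# Hypothetical samples, chosen to resemble the ~95% steal shown by vmstat.
sample_before = "cpu 100 0 10 1000 0 0 0 50 0 0"
sample_after = "cpu 105 0 10 1000 0 0 0 145 0 0"
steal = steal_percent(sample_before, sample_after)
```

On a live system, calling read_cpu_line() twice about a second apart and passing both results to steal_percent() should give a figure comparable to the "st" column of "vmstat 1".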