---------- Пересланное сообщение ---------- Тема: Re: squid3.0 STABLE16 cpu load again Дата: 20 ноября 2009 Отправитель: fedorischev <fedorischev@xxxxxxxxxx> Получатель: squid-users@xxxxxxxxxxxxxxx В сообщении от Friday 20 November 2009 11:09:34 Angelo Höngens написал(а): > On 20-11-2009 9:01, fedorischev wrote: > > Hello Squid list ! > > > > I having a problem with Squid CPU usage. The environment is described > > below: > > > > VMWare ESXi guest system - CentOS 5.3 x86_64 running two Squid's. This > > system have 3 Gb RAM, > > two CPU's each other 3.40 GHz. > > First Squid is running with RAM-cache size 1 Gb. Second Squid is running > > without any cache and use first as parent cache. Both are > > SQUID3.0-STABLE16. > > > > Here is some parts of config below: > > > > #first Squid: /etc/squid/squid.conf > > > > http_port 8080 > > acl QUERY urlpath_regex cgi-bin \? > > cache deny QUERY > > cache_mem 1024 MB > > maximum_object_size 1024 KB > > maximum_object_size_in_memory 1024 KB > > cache_replacement_policy heap GDSF > > memory_replacement_policy heap GDSF > > cache_dir null /null > > ...skipped access logs... > > > > #we use third-side mysql authorization program > > auth_param basic program /usr/lib64/squid/mysql_auth > > auth_param basic children 20 > > auth_param basic realm Squid proxy-caching web server > > auth_param basic credentialsttl 20 second > > auth_param basic casesensitive off > > > > ...skipped refresh patterns... > > > > ...skipped 34 ACL directives - non-regex based mainly > > > > http_reply_access allow all > > > > ...skipped log access... > > > > cache_effective_user nobody > > visible_hostname proxy.bsu.edu.ru > > > > ...and something else > > > > #second Squid: /etc/squid-noauth/squid.conf > > > > http_port 8082 > > #use first parent > > cache_peer localhost parent 8080 0 no-query > > hierarchy_stoplist cgi-bin ? > > #caching is disabled to everything > > no_cache deny all > > cache_mem 8 MB > > cache_dir null /dev/null > > ...skipping cache logs... > > #in this Squid we have something more ACls - time-based mainly and a > > little bit of regex-based > > acl class0 src 172.18.31.1-172.18.31.199/32 > > acl class0_time time SMTWHFA 00:00-23:59 > > http_access allow class0 class0_time > > #we have about 35 class sections described above > > visible_hostname proxy1.bsu.edu.ru > > memory_pools off > > forwarded_for on > > > > ...about 7 delay pool classes > > > > It's not full configs, but I may present it if would be necessary. > > The problem is not so simple: the second squid is periodically utilize > > CPU and this is not depends on proxy users' activity. Usually this squid > > load about 5-12 % CPU, but sometimes loading is hardly increasing - about > > 70-90 % CPU (top reports it), while squidclient is reporting only 30-40% > > load. Squid is continue working wery well under load, but why it's so > > highly loading ? In any case, I repeat, this is not depends on users > > activity. > > > > Here is squidclient report on this proxy: > > # squidclient -p 8082 mgr:info > > > > HTTP/1.0 200 OK > > Server: squid/3.0.STABLE16 > > Mime-Version: 1.0 > > Date: Fri, 20 Nov 2009 07:04:21 GMT > > Content-Type: text/plain > > Expires: Fri, 20 Nov 2009 07:04:21 GMT > > Last-Modified: Fri, 20 Nov 2009 07:04:21 GMT > > X-Cache: MISS from proxy1.bsu.edu.ru > > X-Cache-Lookup: MISS from proxy1.bsu.edu.ru:8082 > > Via: 1.0 proxy1.bsu.edu.ru (squid/3.0.STABLE16) > > Proxy-Connection: close > > > > Squid Object Cache: Version 3.0.STABLE16 > > Start Time: Thu, 19 Nov 2009 07:32:34 GMT > > Current Time: Fri, 20 Nov 2009 07:04:21 GMT > > Connection information for squid: > > Number of clients accessing cache: 87 > > Number of HTTP requests received: 403810 > > Number of ICP messages received: 0 > > Number of ICP messages sent: 0 > > Number of queued ICP replies: 0 > > Number of HTCP messages received: 0 > > Number of HTCP messages sent: 0 > > Request failure ratio: 0.00 > > Average HTTP requests per minute since start: 286.0 > > Average ICP messages per minute since start: 0.0 > > Select loop called: -1040018272 times, -0.081 ms avg > > Cache information for squid: > > Hits as % of all requests: 5min: 0.0%, 60min: 0.0% > > Hits as % of bytes sent: 5min: 0.4%, 60min: 1.4% > > Memory hits as % of hit requests: 5min: 0.0%, 60min: 0.0% > > Disk hits as % of hit requests: 5min: 0.0%, 60min: 0.0% > > Storage Swap size: 0 KB > > Storage Swap capacity: 0.0% used, 0.0% free > > Storage Mem size: 256 KB > > Storage Mem capacity: 3.1% used, 96.9% free > > Mean Object Size: 0.00 KB > > Requests given to unlinkd: 0 > > Median Service Times (seconds) 5 min 60 min: > > HTTP Requests (All): 0.05633 0.04277 > > Cache Misses: 0.05633 0.04277 > > Cache Hits: 0.00000 0.00000 > > Near Hits: 0.00000 0.00000 > > Not-Modified Replies: 0.00000 0.00000 > > DNS Lookups: 0.00000 0.00000 > > ICP Queries: 0.00000 0.00000 > > Resource usage for squid: > > UP Time: 84706.947 seconds > > CPU Time: 54585.067 seconds > > CPU Usage: 64.44% > > CPU Usage, 5 minute avg: 0.28% > > CPU Usage, 60 minute avg: 0.24% > > Process Data Segment Size via sbrk(): 5696 KB > > Maximum Resident Size: 0 KB > > Page faults with physical i/o: 2 > > Memory usage for squid via mallinfo(): > > Total space in arena: 15188 KB > > Ordinary blocks: 11857 KB 46 blks > > Small blocks: 0 KB 0 blks > > Holding blocks: 3572 KB 17 blks > > Free Small blocks: 0 KB > > Free Ordinary blocks: 3330 KB > > Total in use: 15429 KB 82% > > Total free: 3330 KB 18% > > Total size: 18760 KB > > Memory accounted for: > > Total accounted: 4846 KB 26% > > memPool accounted: 4846 KB 26% > > memPool unaccounted: 13913 KB 74% > > memPoolAlloc calls: 373989217 > > memPoolFree calls: 373985477 > > File descriptor usage for squid: > > Maximum number of file descriptors: 1024 > > Largest file desc currently in use: 73 > > Number of file desc currently in use: 64 > > Files queued for open: 0 > > Available number of file descriptors: 960 > > Reserved number of file descriptors: 100 > > Store Disk files open: 0 > > Internal Data Structures: > > 41 StoreEntries > > 41 StoreEntries with MemObjects > > 37 Hot Object Cache Items > > 0 on-disk objects > > > > > > And this is chunk of the "cache.log" that describes starting of squid: > > > > 2009/11/19 10:32:34| Starting Squid Cache version 3.0.STABLE16 for > > x86_64-redhat-linux-gnu... > > 2009/11/19 10:32:34| Process ID 6160 > > 2009/11/19 10:32:34| With 1024 file descriptors available > > 2009/11/19 10:32:34| DNS Socket created at 0.0.0.0, port 33274, FD 10 > > 2009/11/19 10:32:34| Adding domain bsu.edu.ru from /etc/resolv.conf > > 2009/11/19 10:32:34| Adding nameserver 172.16.1.1 from /etc/resolv.conf > > 2009/11/19 10:32:34| Adding nameserver 172.24.1.1 from /etc/resolv.conf > > 2009/11/19 10:32:34| User-Agent logging is disabled. > > 2009/11/19 10:32:34| Referer logging is disabled. > > 2009/11/19 10:32:35| Unlinkd pipe opened on FD 15 > > 2009/11/19 10:32:35| Local cache digest enabled; rebuild/rewrite every > > 3600/3600 sec > > 2009/11/19 10:32:35| Swap maxSize 0 + 0 KB, estimated 0 objects > > 2009/11/19 10:32:35| Target number of buckets: 0 > > 2009/11/19 10:32:35| Using 8192 Store buckets > > 2009/11/19 10:32:35| Max Mem size: 0 KB > > 2009/11/19 10:32:35| Max Swap size: 0 KB > > 2009/11/19 10:32:35| Using Least Load store dir selection > > 2009/11/19 10:32:35| Set Current Directory to /var/spool/squid-noauth > > 2009/11/19 10:32:35| Loaded Icons. > > 2009/11/19 10:32:35| Accepting HTTP connections at 0.0.0.0, port 8082, > > FD 14. 2009/11/19 10:32:35| HTCP Disabled. > > 2009/11/19 10:32:35| Accepting SNMP messages on port 3402, FD 16. > > 2009/11/19 10:32:35| Configuring Parent localhost/8080/0 > > 2009/11/19 10:32:35| Ready to serve requests. > > 2009/11/19 10:32:35| Finished rebuilding storage from disk. > > 2009/11/19 10:32:35| 0 Entries scanned > > 2009/11/19 10:32:35| 0 Invalid entries. > > 2009/11/19 10:32:35| 0 With invalid flags. > > 2009/11/19 10:32:35| 0 Objects loaded. > > 2009/11/19 10:32:35| 0 Objects expired. > > 2009/11/19 10:32:35| 0 Objects cancelled. > > 2009/11/19 10:32:35| 0 Duplicate URLs purged. > > 2009/11/19 10:32:35| 0 Swapfile clashes avoided. > > 2009/11/19 10:32:35| Took 0.30 seconds ( 0.00 objects/sec). > > 2009/11/19 10:32:35| Beginning Validation Procedure > > 2009/11/19 10:32:35| Completed Validation Procedure > > 2009/11/19 10:32:35| Validated 25 Entries > > 2009/11/19 10:32:35| store_swap_size = 0 > > 2009/11/19 10:32:36| storeLateRelease: released 0 objects > > > > > > Any tips, tricks ? What I'm missing ? Thank you. Regards. > > Did you execute the steps described in the following document? > > VMware Self-Service- Timekeeping best practices for Linux guests > (http://kb.vmware.com/kb/1006427) > > It could be that if your timekeeping is off and/or ntpd is not running > to correct tick duration, top might report incorrect information. Time synchronization through ntpd was(is) enabled - server time is correct. And thank you for information about VMWare timekeeping. I'll try to apply this recommendations & post results on list later. Regards, Igor. ------------------------------------------------------- OK, I think, I find CPU loading reason. I used strace to determine the CPU problem - traced read/write to watch Squid I/O activity. And this is the normal behavior of Squid: # strace -p 9161 2>&1 | grep read read(417, "CITTFaxDecode /DecodeParms << /K"..., 4094) = 4094 read(417, "de /DecodeParms << /K -1 /Column"..., 4094) = 4094 read(417, "j\r650 0 obj\r<< /Type /XObject /S"..., 4094) = 4094 read(10, "http://sitecheck2.opera.com/?hos"..., 8191) = 107 read(11, "http://clck.yandex.ru/redir/AiuY"..., 8191) = 414 read(417, "e /Width 8 /Height 32 /BitsPerCo"..., 4094) = 4094 read(417, "ct /Subtype /Image /Width 6 /Hei"..., 4094) = 4094 # strace -p 9161 2>&1 | grep write write(10, "http://an.yandex.ru/resource/con"..., 106) = 106 write(11, "http://an.yandex.ru/resource/con"..., 112) = 112 write(150, "HTTP/1.0 304 \r\nDate: Wed, 25 Nov"..., 312) = 312 write(315, "HTTP/1.0 304 \r\nDate: Wed, 25 Nov"..., 311) = 311 write(233, "HTTP/1.0 407 Proxy Authenticatio"..., 2222) = 2222 write(9, "1259159288.157 1 172.23.100"..., 102) = 102 write(11, "http://bs.yandex.ru/watch/7468?r"..., 1514) = 1514 write(10, "http://an.yandex.ru/code/7468?rn"..., 890) = 890 When CPU loading is increase to 80-90 % strace reports this: read(67, "\212", 1) = 1 read(57, "s", 1) = 1 read(67, "\331", 1) = 1 read(57, "\352", 1) = 1 read(67, "2", 1) = 1 read(57, "\252", 1) = 1 read(67, "\201", 1) = 1 read(57, "\213", 1) = 1 read(67, "\245", 1) = 1 read(57, "\331", 1) = 1 read(67, "\300", 1) = 1 read(57, "\27", 1) = 1 read(67, "\334", 1) = 1 read(57, "\207", 1) = 1 read(67, "\205", 1) = 1 read(57, "d", 1) = 1 read(67, "\300", 1) = 1 read(57, "\377", 1) = 1 read(67, "\24", 1) = 1 read(57, "\37", 1) = 1 write(23, "P", 1) = 1 write(38, "d", 1) = 1 write(23, "\253", 1) = 1 write(38, "w", 1) = 1 write(23, "\31", 1) = 1 write(38, "o", 1) = 1 write(23, "A", 1) = 1 write(38, "\353", 1) = 1 write(23, "\333", 1) = 1 write(38, "\300", 1) = 1 write(23, "I", 1) = 1 write(38, "S", 1) = 1 write(23, "d", 1) = 1 write(38, "(", 1) = 1 write(23, "D", 1) = 1 write(38, "\335", 1) = 1 write(23, "\341", 1) = 1 write(38, "\226", 1) = 1 This behavior is typical for one client - he is always downloading something from file sharing services e.g. rapidshare etc. In this example FDs 23,38,57,67 are occupied by him. I think (correct me if not right), that loading is concerned with non-rational one-byte syscalls - read/write always processed 1 byte for each call. Is this a bug, or, maybe something else ? Why Squid don't use block-read & write to avoid syscalls overhead ? WBR.