>> Peter:
>> Do you mean you've patched the source code, and if so, how do I get
>> that patch? Do I have to move from the stable trunk?

> Amos:
> Sorry, yes, that is what I meant, and it can now be found here:
> http://www.squid-cache.org/Versions/v3/3.HEAD/changesets/squid-3-12957.patch
> It should apply on the stable 3.3 easily, although I have not tested that.
> NP: if you rebuild, please go with the 3.3.8 security update release.

I have patched the file as documented and recompiled against the 3.3.8 branch.

>> Peter:
>> The first log occurrences are:
>> 2013/07/23 08:26:13 kid2| Attempt to open socket for EUI retrieval failed: (24) Too many open files
>> 2013/07/23 08:26:13 kid2| comm_open: socket failure: (24) Too many open files
>> 2013/07/23 08:26:13 kid2| Reserved FD adjusted from 100 to 15394 due to failures

> Amos:
> So this worker #2 got errors after reaching about 990 open FD (16K - 15394). Ouch.
> Note that all these socket opening operations are failing with the "Too many open
> files" error the OS sends back when limiting Squid to 990 or so FD. This confirms
> that Squid is not mis-calculating where its limit is; something in the OS is
> actually limiting the worker. The first failure was on a socket, but disk file
> accesses start failing soon after, so it is likely the global OS limit rather
> than a limit on a particular FD type. That 990 usable FD is also suspiciously
> close to 1024 with a few % held spare for emergency use (as Squid does when
> calculating its reservation value).

Amos, I don't understand how you deduced the 990 open FD from the error messages
above ("adjusted from 100 to 15394"). I would have deduced that there was some
internal limit of 100 (not 1000) FDs, and that Squid was re-adjusting to the
maximum currently allowed (16K). Where is my logic wrong, or what other
information led to your conclusion? It is important for me to understand, as I
think I have addressed the maximum file descriptors:

/etc/security/limits.conf includes:

    # - Increase file descriptor limits for Squid
    *       soft    nofile  65536
    *       hard    nofile  65536
    root    soft    nofile  65536
    root    hard    nofile  65536

/etc/pam.d/common-session* includes:

    # Squid requires this change to increase limit of file descriptors
    session required        pam_limits.so

After a reboot, if I log in as root or squid, "ulimit -Sn" gives 65536.
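As an extra sanity check (assuming all the worker processes show up under
"pidof squid" on this box), I was also going to confirm the limit each running
process actually inherited, in case the limits.conf change is not reaching
processes started from the init script rather than a login session:

    # Show the effective open-file limit for every running squid process
    for pid in $(pidof squid); do
        echo -n "PID $pid: "
        grep 'Max open files' /proc/$pid/limits
    done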
I included the following options to my squid ./configure script:

    ./configure \
        --prefix=/usr \
        --localstatedir=/var \
        --libexecdir=${prefix}/lib/squid \
        --srcdir=. \
        --datadir=${prefix}/share/squid \
        --sysconfdir=/etc/squid \
        --with-default-user=proxy \
        --with-logdir=/var/log \
        --with-pidfile=/var/run/squid.pid \
        --enable-snmp \
        --enable-storeio=aufs,ufs \
        --enable-async-io \
        --with-maxfd=65536 \
        --with-filedescriptors=65536 \
        --enable-linux-netfilter \
        --enable-wccpv2

Here is the output of mgr:info a short while after starting up again:

HTTP/1.1 200 OK
Server: squid/3.3.8
Mime-Version: 1.0
Date: Sun, 28 Jul 2013 06:16:09 GMT
Content-Type: text/plain
Expires: Sun, 28 Jul 2013 06:16:09 GMT
Last-Modified: Sun, 28 Jul 2013 06:16:09 GMT
Connection: close

Squid Object Cache: Version 3.3.8
Start Time:     Sun, 28 Jul 2013 06:14:31 GMT
Current Time:   Sun, 28 Jul 2013 06:16:09 GMT
Connection information for squid:
        Number of clients accessing cache:      20
        Number of HTTP requests received:       1772
        Number of ICP messages received:        0
        Number of ICP messages sent:            0
        Number of queued ICP replies:           0
        Number of HTCP messages received:       0
        Number of HTCP messages sent:           0
        Request failure ratio:  0.00
        Average HTTP requests per minute since start:   1078.8
        Average ICP messages per minute since start:    0.0
        Select loop called: 598022 times, 1.093 ms avg
Cache information for squid:
        Hits as % of all requests:        5min: 1.6%, 60min: 1.6%
        Hits as % of bytes sent:          5min: 0.2%, 60min: 0.2%
        Memory hits as % of hit requests: 5min: 37.0%, 60min: 37.0%
        Disk hits as % of hit requests:   5min: 27.8%, 60min: 27.8%
        Storage Swap size:      72074368 KB
        Storage Swap capacity:   2.9% used, 97.1% free
        Storage Mem size:       8640 KB
        Storage Mem capacity:    3.3% used, 96.7% free
        Mean Object Size:       22.30 KB
        Requests given to unlinkd:      0
Median Service Times (seconds)  5 min    60 min:
        HTTP Requests (All):   0.47928  0.47928
        Cache Misses:          0.48649  0.48649
        Cache Hits:            0.02796  0.02796
        Near Hits:             0.00000  0.00000
        Not-Modified Replies:  0.00000  0.00000
        DNS Lookups:           0.16304  0.16304
        ICP Queries:           0.00000  0.00000
Resource usage for squid:
        UP Time:        98.555 seconds
        CPU Time:       11.945 seconds
        CPU Usage:      12.12%
        CPU Usage, 5 minute avg:        16.44%
        CPU Usage, 60 minute avg:       16.44%
        Process Data Segment Size via sbrk(): 594624 KB
        Maximum Resident Size: 3144976 KB
        Page faults with physical i/o: 0
Memory usage for squid via mallinfo():
        Total space in arena:  595416 KB
        Ordinary blocks:       594314 KB    619 blks
        Small blocks:               0 KB      0 blks
        Holding blocks:        307784 KB     50 blks
        Free Small blocks:          0 KB
        Free Ordinary blocks:    1102 KB
        Total in use:            1102 KB 0%
        Total free:              1102 KB 0%
        Total size:            903200 KB
Memory accounted for:
        Total accounted:       467181 KB  52%
        memPool accounted:     467181 KB  52%
        memPool unaccounted:   436019 KB  48%
        memPoolAlloc calls: 6905819
        memPoolFree calls:  6911935
File descriptor usage for squid:
        Maximum number of file descriptors:   393216
        Largest file desc currently in use:      501
        Number of file desc currently in use:    850
        Files queued for open:                     0
        Available number of file descriptors: 392366
        Reserved number of file descriptors:     600
        Store Disk files open:                     2
Internal Data Structures:
        3233010 StoreEntries
            352 StoreEntries with MemObjects
            270 Hot Object Cache Items
        3232658 on-disk objects

Can anyone see any problems, before I expand the affected IP ranges to allow the full client load?
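Once the full load is on, I was thinking of watching the same report rather
than waiting for errors in cache.log. A rough sketch (host and port are
assumptions for my setup; adjust to wherever squidclient can reach the proxy):

    # Poll the file-descriptor section of mgr:info once a minute
    watch -n 60 "squidclient -h 127.0.0.1 -p 3128 mgr:info | grep -A 6 'File descriptor usage'"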