The result of the test with vm.pmap.pg_ps_enabled set to 1
is ... different from what I expected.
The values of vm.pmap.pde.p_failures and vm.pmap.pde.demotions
indicate that the page daemon has problems creating and
maintaining superpages. Maybe the load is too high
for superpages to be created, or the page daemon's algorithm
needs to be tuned.
Perhaps the FreeBSD list can offer more assistance
on the question of superpages and explain the high
values of vm.pmap.pde.p_failures and
vm.pmap.pde.demotions...
But one last try from me: the machine has 24 GB and Squid has
19 GB. My guess is that on the first fork the OS demotes many
superpages because it needs to map the child process into
virtual memory, and since superpages cannot be swapped out they
get demoted. The second fork then demotes even more superpages...
To make the first fork fast, Squid must stay below 10 GB, so that
Squid and its child together fit within physical memory
(roughly 2 x 10 GB on a 24 GB machine). You will also need a low
value for sleep_after_fork so that the exec happens before the
next fork is executed.
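To make that concrete, something like this in squid.conf (the
numbers are only an illustration, and check the unit of
sleep_after_fork in your Squid release; I believe it is
microseconds):

  cache_mem 8 GB          # keep the process well below half of RAM
  sleep_after_fork 1000   # short pause after each fork so the exec can run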
There are alternative solutions to the problem:
1. Redesign the URL rewriter as a multithreaded application that
accepts multiple requests from Squid simultaneously (use
url_rewrite_concurrency in squid.conf; see the config sketch
after this list).
This way there is only one child process and only one fork on
'squid -k rotate'.
2. Redesign the URL rewriter so that it rereads
its configuration/database on receipt of a signal, and
send this signal every 24 hours instead of doing the 'squid -k rotate'.
This way Squid does not fork at all
(though you may still want a 'squid -k rotate' once per week).
3. Modify the URL rewriter to accept multiple requests
simultaneously (use url_rewrite_concurrency 4) and reduce the
number of rewriters to 4. The URL rewriter is not made
multithreaded; it simply queues the requests.
This way Squid does only 4 forks instead of 15.
4. Use fewer URL rewriters. You might get an occasional
'not enough rewriters' warning from Squid, in which case the
redirector may be bypassed (use url_rewrite_bypass).
Depending on what exactly the URL rewriter does, this
might be acceptable.
This way Squid does fewer forks.
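For options 1, 3 and 4 the relevant squid.conf directives would
look roughly like this (the program path and the numbers are only
examples):

  url_rewrite_program /usr/local/bin/rewriter  # example path
  url_rewrite_children 4      # fewer helpers = fewer forks on rotate
  url_rewrite_concurrency 4   # needs a helper that speaks the concurrent protocol
  url_rewrite_bypass on       # option 4: skip rewriting when no helper is free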
Option 1 requires a lot of work and is probably the best solution.
Option 2 requires a lot of work and is less robust than option 1,
because during the reconfiguration it cannot serve requests from Squid.
Option 3 is most likely very simple, but it is unknown how much it helps.
Option 4 is simple, but depending on the functionality
of the rewriter it may or may not be acceptable. You need to experiment
to see if it helps.
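As a very rough sketch of option 3, a helper that speaks the
concurrent protocol but handles one request at a time could look
something like the Python below. The request format
('<channel-id> <URL> ...') and the reply format
('<channel-id> <rewritten-URL or empty>') are my recollection of
the concurrent redirector protocol, and rewrite() is only a
placeholder, so check the documentation of your Squid version
before relying on this:

  #!/usr/bin/env python
  # Sketch of a URL rewriter for url_rewrite_concurrency > 0.
  # Squid is assumed to prefix every request line with a channel id
  # and to expect that same id back at the start of the reply line.
  import sys

  def rewrite(url):
      # Placeholder for the real lookup/rewrite logic.
      return ""   # an empty reply means "leave the URL unchanged"

  while True:
      line = sys.stdin.readline()
      if not line:
          break
      parts = line.split()
      if not parts:
          continue
      channel_id = parts[0]
      url = parts[1] if len(parts) > 1 else ""
      # Echo the channel id so Squid can match the reply to the request.
      sys.stdout.write("%s %s\n" % (channel_id, rewrite(url)))
      sys.stdout.flush()   # replies must not sit in stdio buffers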
Marcus
Linda Messerschmidt wrote:
On Tue, Nov 24, 2009 at 9:52 PM, Marcus Kool
<marcus.kool@xxxxxxxxxxxxxxx> wrote:
Linda started this thread with huge performance problems
when Squid with a size of 12 GB forks 15 times.
I'm still not convinced that our memory usage is correct. Here's
where the cache_mem 10GB process is at right now:
Maximum Resident Size: 19376000 KB
Total accounted: 9735764 KB
Storage Mem size: 9296872 KB
338229 StoreEntries
338229 StoreEntries with MemObjects
So there is (9735764 - 9296872) 438,892 KB "overhead" (memory
accounted for that is not used for storing objects) and (19376000 -
9735764) 9,640,236 KB "unaccounted."
I will put the whole memory report at the bottom of this message
though it is very hard to read. :-)
Linda emailed me that she is doing a test with
vm.pmap.pg_ps_enabled set to 1 (the kernel will
transparently promote 4K pages into superpages),
which greatly relieves TLB management overhead
and will most likely reduce the performance problems with
forks of very large processes.
I apologize; I meant to send that to the whole list.
In any case, it did not help. This is from top while it was doing
daily rotation a few moments ago:
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
1205 squid 1 118 0 18650M 18675M CPU6 1 57:22 97.85% squid
74418 squid 1 118 0 16259M 24K CPU2 2 0:05 97.66% squid
It doesn't look like enabling superpages had much effect:
$ sysctl vm.pmap
vm.pmap.pmap_collect_active: 0
vm.pmap.pmap_collect_inactive: 0
vm.pmap.pv_entry_spare: 6673
vm.pmap.pv_entry_allocs: 105465793
vm.pmap.pv_entry_frees: 100913786
vm.pmap.pc_chunk_tryfail: 0
vm.pmap.pc_chunk_frees: 610560
vm.pmap.pc_chunk_allocs: 637695
vm.pmap.pc_chunk_count: 27135
vm.pmap.pv_entry_count: 4552007 <--------
vm.pmap.pde.promotions: 12657 <------
vm.pmap.pde.p_failures: 222291
vm.pmap.pde.mappings: 0
vm.pmap.pde.demotions: 11968 <-------
vm.pmap.shpgperproc: 200
vm.pmap.pv_entry_max: 7330186
vm.pmap.pg_ps_enabled: 1
Here are similar readings from the 16GB machine, where the squid
process has VSZ of 8599M:
vm.pmap.pmap_collect_active: 0
vm.pmap.pmap_collect_inactive: 0
vm.pmap.pv_entry_spare: 16299
vm.pmap.pv_entry_allocs: 2196230824
vm.pmap.pv_entry_frees: 2193907723
vm.pmap.pc_chunk_tryfail: 0
vm.pmap.pc_chunk_frees: 13548719
vm.pmap.pc_chunk_allocs: 13562644
vm.pmap.pc_chunk_count: 13925
vm.pmap.pv_entry_count: 2323101 <------------
vm.pmap.pde.promotions: 0
vm.pmap.pde.p_failures: 0
vm.pmap.pde.mappings: 0
vm.pmap.pde.demotions: 0
vm.pmap.shpgperproc: 200
vm.pmap.pv_entry_max: 4276871
vm.pmap.pg_ps_enabled: 0
Squid uses almost all of the memory in use on both systems, so a
rough comparison of pv entries (mapped pages) per MB of Squid
address space gives:
No superpages:  2323101 / 8599  = 270 per MB
Yes superpages: 4552007 / 18650 = 244 per MB
That's not quite fair, since it compares the whole system's page
table against squid's address space alone, but these machines are
all about squid, so it's close enough.
I also added up the VSZ of every process on the system and divided
it by the current number of pv entries; the system average is just
4.554 KB per entry, so it seems like there must be barely any
superpages in use. That is borne out by the (promotions - demotions)
figure of (12657 - 11968) = 689 active superpages, assuming that
means what I think it means.
Can we modify Squid to fork at a slower pace?
I think there is a config option (sleep_after_fork) for that already.
But it would not help as the processes are already forked one at a
time and the page table duplication happens before the exec. Even if
it did help, sleep_after_fork would have to be several seconds, and
squid would still be nonresponsive for sleep_after_fork * 16.
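(To put a number on it: with a hypothetical sleep_after_fork
equivalent to 3 seconds and 16 helpers, a rotate would leave Squid
unresponsive for roughly 3 * 16 = 48 seconds.)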
It sounds like this problem is a bit more serious and complicated than
I thought. I really hoped it was a config error on our part. :-(
Here is the full memory report as the process VSZ crossed over to
19,001 MB, which shows a total of 9,317,346 KB in use:
Current memory usage:
Pool columns (each row lists, in order):
  Obj Size:          (bytes)
  Allocated:         (#) (KB) high (KB) high (hrs) impact (%total)
  In Use:            (#) (KB) high (KB) portion (%alloc)
  Idle:              (#) (KB) high (KB)
  Allocations Saved: (number) (%num) (%vol)
  Hit Rate:          (%num) (number)
2K Buffer 2048 96 192 248 0.52 0 91 182 248 95
5 10 94 34958835 4.42 18.12 99.96 34972340
4K Buffer 4096 199 796 3904 11.98 0 194 776 3904
97 5 20 1720 7225790 0.91 7.49 98.41 7342427
8K Buffer 8192 21 168 272 12.05 0 17 136 272 81
4 32 176 1880965 0.24 3.90 98.40 1911548
16K Buffer 16384 3 48 64 13.39 0 0 0 64 0 3 48
64 3600 0.00 0.01 98.41 3658
32K Buffer 32768 2 64 128 14.56 0 0 0 128 0 2
64 128 369 0.00 0.00 98.14 376
64K Buffer 65536 2 128 192 12.81 0 0 0 192 0 2
128 128 88 0.00 0.00 95.65 92
Store Client Buffer 4096 22 88 1052 12.05 0 11 44 1052 50
11 44 812 30568572 3.87 31.69 99.74 30646811
acl 64 11 1 1 15.35 0 11 1 1 100 0 0
1 11 0.00 0.00 50.00 22
acl_ip_data 24 7 1 1 15.35 0 7 1 1 100 0 0 1
7 0.00 0.00 50.00 14
acl_list 24 18 1 1 15.35 0 18 1 1 100 0 0
1 18 0.00 0.00 50.00 36
dlink_node 24 337 8 8 0.52 0 0 0 8 0 337 8 8
1088 0.00 0.00 76.35 1425
FwdServer 24 41 1 6 12.05 0 9 1 6 22 32 1 5
3875752 0.49 0.02 99.87 3880603
HttpReply 168 338927 55606 55606 0.00 1 338922
55605 55606 100 5 1 23 10817354 1.37 0.46 96.17 11248509
HttpHeaderEntry 40 2602377 101656 101656 0.00 1 2602282
101652 101656 100 95 4 101 200511069 25.36 2.03 98.06
204487269
HttpHdrCc 40 27921 1091 1091 0.01 0 27914 1091
1091 100 7 1 3 7294400 0.92 0.07 99.17 7355123
HttpHdrRangeSpec 16 96 2 2 13.43 0 0 0 2 0 96 2 2
25596 0.00 0.00 99.31 25775
HttpHdrRange 16 2 1 1 12.07 0 0 0 1 0 2 1 1
21759 0.00 0.00 99.44 21881
HttpHdrContRange 24 5 1 1 13.46 0 0 0 1 0 5 1 1
10713 0.00 0.00 97.21 11021
intlist 16 3 1 1 15.35 0 3 1 1 100 0 0 1
3 0.00 0.00 50.00 6
MemObject 272 338920 90026 90026 0.00 1 338915
90025 90026 100 5 2 39 5022931 0.64 0.35 93.21 5388889
mem_node 4112 2338883 9392078 9392078 0.00 96
2338871 9392029 9392078 100 12 49 3908 15349368 1.94 15.98
81.20 18902877
relist 48 8 1 1 15.35 0 8 1 1 100 0 0 1
8 0.00 0.00 50.00 16
request_t 1360 341 453 893 12.05 0 326 433 893
96 15 20 496 9456631 1.20 3.26 98.60 9590792
StoreEntry 88 338920 29126 29126 0.00 0 338915
29126 29126 100 5 1 13 5025544 0.64 0.11 93.26 5388889
wordlist 16 14 1 1 15.35 0 11 1 1 79 3 1 1
88 0.00 0.00 86.27 102
ClientInfo 344 28096 9439 14657 12.40 0 15714 5279
14657 56 12382 4160 4852 653528 0.08 0.06 93.63 697962
MD5 digest 16 338920 5296 5296 0.00 0 338915 5296
5296 100 5 1 3 13112283 1.66 0.05 97.40 13462778
helper_request 40 353 14 14 0.52 0 1 1 14 0 352
14 14 4805587 0.61 0.05 99.99 4805971
storeSwapTLV 24 5 1 1 15.32 0 0 0 1 0 5 1 1
47627493 6.02 0.29 100.00 47627498
Short Strings 36 2939320 103336 103336 0.00 1 2939241
103333 103336 100 79 3 75 219162796 27.72 2.00 98.15
223283726
Medium Strings 128 15166 1896 1951 0.52 0 15126 1891
1951 100 40 5 89 50029245 6.33 1.62 99.39 50334410
Long Strings 512 5927 2964 3002 0.52 0 5911 2956
3002 100 16 8 143 9251916 1.17 1.20 99.22 9325056
cbdata acl_access (1) 56 13 1 1 15.35 0 13 1 1 100 0 0
1 13 0.00 0.00 50.00 26
cbdata aclCheck_t (2) 312
1 1 1 15.32 0 0 0 1 0 1 1 1 22575783 2.86 1.78 100.00 22575832
cbdata clientHttpRequest (3) 1096 21 23 481 0.52 0 15 17 481
71 6 7 387 4735510 0.60 1.31 98.21 4821935
cbdata ConnStateData (4) 320 194 61 299 11.98 0 190 60 299
98 4 2 137 2604058 0.33 0.21 97.41 2673373
cbdata ErrorState (5) 160 200 32 32 0.06 0 199 32 32 100
1 1 1 16266 0.00 0.00 98.79 16466
cbdata FwdState (6) 96 31 3 24 12.05 0 9 1 24 29 22 3
20 3871284 0.49 0.09 99.76 3880603
cbdata generic_cbdata (7) 32 30 1 4 15.32 0 0 0 4 0 30
1 4 14750620 1.87 0.12 100.00 14750938
cbdata helper (8) 136 1 1 1 15.35 0 1 1 1 100 0 0 0
0 0.00 0.00 0.00 1
cbdata helper_server (9) 152 16 3 3 15.35 0 16 3 3 100 0
0 3 32 0.00 0.00 66.67 48
cbdata HttpStateData (12) 152 38 6 37 12.05 0 9 2 37 24
29 5 32 3874204 0.49 0.15 99.85 3880194
cbdata ps_state (14) 200 1 1 1 15.32 0 0 0 1 0 1 1 1
3880602 0.49 0.20 100.00 3880603
cbdata RemovalPolicy (15) 104 1 1 1 15.35 0 1 1 1 100 0
0 0 0 0.00 0.00 0.00 1
cbdata store_client (18) 128 167 21 45 12.05 0 133 17 45
80 34 5 29 8643422 1.09 0.28 99.89 8652735
event 48 34 2 3 12.38 0 31 2 3 91 3 1 1
8007262 1.01 0.10 99.96 8010458
cbdata http_port_list (19) 128 2 1 1 14.75 0 2 1 1 100 0
0 0 0 0.00 0.00 0.00 2
cbdata body_size (20) 64 1 1 1 15.35 0 1 1 1 100 0 0 1
1 0.00 0.00 50.00 2
cbdata ConnectStateData (21) 96 16 2 11 15.32 0 0 0 11 0
16 2 11 3826157 0.48 0.09 99.96 3827561
close_handler 24 236 6 31 12.05 0 224 6 31 95 12
1 13 14200692 1.80 0.09 99.57 14261946
ipcache_entry 128 978 123 137 13.32 0 951 119 137
97 27 4 22 611237 0.08 0.02 98.86 618298
fqdncache_entry 160 20 4 4 15.35 0 20 4 4 100 0 0
4 24 0.00 0.00 54.55 44
cbdata idns_query (22) 8680 2 17 941 15.32 0 0 0 941 0 2
17 941 616884 0.08 1.36 99.78 618218
cbdata redirectStateData (23) 72 353 25 25 0.52 0 1 1 25
0 352 25 25 4805595 0.61 0.09 99.99 4805971
cbdata Logfile (26) 1120 3 4 4 15.35 0 3 4 4 100 0 0 4
3 0.00 0.00 50.00 6
pconn_data 32 6 1 1 13.34 0 1 1 1 17 5 1 1
61876 0.01 0.00 99.69 62069
pconn_fds 32 6 1 1 13.34 0 1 1 1 17 5 1 1
61889 0.01 0.00 99.71 62069
cbdata AddVaryState (29) 160 1 1 2 12.38 0 0 0 2 0 1 1
2 305954 0.04 0.01 99.84 306435
cbdata LocateVaryState (30) 136 3 1 1 12.38 0 0 0 1 0 3
1 1 571615 0.07 0.02 99.99 571661
VaryData 32 6 1 1 13.62 0 1 1 1 17 5 1 1
570046 0.07 0.00 99.72 571661
cbdata StoreUpdateState (31) 4152 2 9 41 12.34 0 0 0 41 0
2 9 41 1044654 0.13 1.10 99.71 1047729
Total - 9317346 9794810 9794810 0.00 100
9303255 9790117 9790185 100 14091 4693 5120 776329091 98.19
95.79 98.19 790614788
Cumulative allocated volume: 395.07 GB
Current overhead: 301304 bytes (0.003%)
Idle pool limit: 5.00 MB
memPoolAlloc calls: 790614788
memPoolFree calls: 781311532
String Pool Impact
(%strings) (%volume)
Short Strings 99 96
Medium Strings 1 2
Long Strings 0 3
Other Strings 0 0
Large buffers: 0 (0 KB)
Are we 100% certain this is normal / correct? Cutting the process
size in half would double the process spawn rate. :)
Thanks!