Alex,

Thanks for your quick response and good suggestion. You are definitely right that I was testing a useless-digest Squid, yet with higher CPU utilization than I expected. In the coming days I will do more testing and profiling with a modified Squid as you suggested, and collect data beyond CPU utilization. Thanks again.

Best Regards,
Bo Zhou

>> -----Original Message-----
>> From: Alex Rousskov [mailto:rousskov@xxxxxxxxxxxxxxxxxxxxxxx]
>> Sent: Monday, April 21, 2008 10:31 PM
>> To: Zhou, Bo(Bram)
>> Cc: squid-users@xxxxxxxxxxxxxxx
>> Subject: Re: Help Needed: Any suggestion on performance downgrade
>> after enable Cache Digest?
>>
>> On Mon, 2008-04-21 at 18:48 +0800, Zhou, Bo(Bram) wrote:
>>
>> > Recently I did some interesting performance testing on Squid configured
>> > with Cache Digest enabled. The results show that Squid uses more than
>> > 20% more CPU time than Squid running without Cache Digest.
>>
>> Thank you for posting the results with a rather detailed description
>> (please consider also posting the Polygraph workload next time you do
>> this).
>>
>> > Following are my detailed testing environment, configuration, and
>> > results. Any light on the possible reason will be greatly appreciated.
>>
>> Besides fetching peer digests and rebuilding the local digest, using
>> digests requires Squid to do the following for each "cachable" request:
>> - compute the digest key (should be cheap)
>> - look up the digest hash tables (should be cheap for one peer)
>> - for CD hits, ask the peer for the object (expensive)
>> - update the digest (cheap)
>>
>> As far as I understand, your test setup measured the sum of the "cheap"
>> overheads but did not measure the expensive part. Perhaps more
>> importantly, the test did not allow for any cache digest hits, so you
>> are comparing a no-digest Squid with a useless-digest Squid.
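To make the "cheap" steps above concrete: a cache digest is essentially a Bloom filter, so computing the key, looking it up, and updating it amount to hashing and testing/setting a few bits. The Python sketch below mimics that idea only; the four-way MD5 split, sizing, and class name are illustrative assumptions, not Squid's actual implementation.

```python
import hashlib

class CacheDigest:
    """Toy Bloom-filter digest mimicking compute-key, lookup, and update.
    The 4-hash split of one MD5 and the sizing are illustrative assumptions."""

    def __init__(self, capacity, bits_per_entry=5):
        self.nbits = capacity * bits_per_entry
        self.bits = bytearray((self.nbits + 7) // 8)

    def _hashes(self, key):
        # Compute the digest key (cheap): one MD5, split into 4 bit indices.
        h = hashlib.md5(key.encode()).digest()
        for i in range(4):
            yield int.from_bytes(h[i * 4:(i + 1) * 4], "big") % self.nbits

    def add(self, key):
        # Update the digest (cheap): set a few bits.
        for b in self._hashes(key):
            self.bits[b // 8] |= 1 << (b % 8)

    def lookup(self, key):
        # Digest lookup (cheap): test a few bits. May yield false positives
        # (a CD "hit" that the peer cannot serve), never false negatives.
        return all(self.bits[b // 8] & (1 << (b % 8)) for b in self._hashes(key))

d = CacheDigest(capacity=1000)
d.add("http://example.com/a")
print(d.lookup("http://example.com/a"))    # True: a digest hit
print(d.lookup("http://example.com/zzz"))  # almost certainly False
```

The expensive step in the list, asking the peer for the object on a digest hit, happens outside this structure, which is why a workload with zero possible hits exercises only the cheap operations.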
>> It would be nice if you ran a test where all Polygraph Robots request
>> URLs that can be in both peer caches and where a hit has a much lower
>> response time (because there is no artificial server-side delay).
>> Depending on your hit ratio and other factors, you may see a
>> significant overall improvement despite the overheads (otherwise
>> peering would be useless).
>>
>> I am not a big fan of CPU utilization as the primary measurement
>> because it can bite you if the program does more "select" loops than
>> needed when not fully loaded. I would recommend focusing on response
>> time while using CPU utilization as an internal/secondary measurement.
>> However, let's assume that in your particular tests CPU utilization is
>> a good metric (it can be!).
>>
>> A 20% CPU utilization increase is more than I would expect if there
>> are no peer queries. On the other hand, you also report a 30% CPU
>> increase when two peers are busy (test1 versus test2). Thus, your test
>> precision itself can be within that 20% bracket. It would be
>> interesting to see a test4 with one busy and one idle no-digest proxy.
>>
>> If you can modify the code a little, it should be fairly easy to
>> isolate the core reason for the CPU utilization increase compared to a
>> no-digest Squid. Profiling may lead to similar results.
>>
>> For example, I would disable all digest lookups (return "not found"
>> immediately) and local updates (do nothing) to make sure the CPU
>> utilization matches that of the no-digest tests. If CPU usage in that
>> test goes down about 20%, the next step would be to check whether it
>> is the lookup, the updates, or both. I would leave the lookup off but
>> re-enable the updates and see what happens. Again, profiling may allow
>> you to do similar preliminary analysis without rerunning the test.
>>
>> HTH,
>>
>> Alex.
>>
>> > Please also point out the possible configuration errors, if any.
>> > Thanks a lot.
>> >
>> > 1. Hardware configuration: HP DL380
>> > (1) Squid server
>> >     CPU: 2 Xeon 2.8 GHz CPUs, each with 2 cores
>> >     Memory: 6 GB; Disk: 36 GB; NIC: 1000 Mbps
>> > (2) Client and web server: Dell Vostro 200 running Web Polygraph 3.1.5
>> >
>> > 2. Squid configuration
>> > (1) 2 Squid instances run on the same HP server, each using the same
>> > IP address but a different port, as a pure in-memory cache.
>> > Squid1 configuration:
>> > http_port 8081
>> > cache_mem 1024 MB
>> > cache_dir null /tmp
>> > cache_peer 192.168.10.2 sibling 8082 0 proxy-only
>> > digest_generation on
>> > digest_bits_per_entry 5
>> > digest_rebuild_period 1 hour
>> > digest_swapout_chunk_size 4096 bytes
>> > digest_rebuild_chunk_percentage 10
>> >
>> > Squid2 configuration:
>> > http_port 8082
>> > cache_mem 1024 MB
>> > cache_dir null /tmp
>> > cache_peer 192.168.10.2 sibling 8081 0 proxy-only
>> > digest_generation on
>> > digest_bits_per_entry 5
>> > digest_rebuild_period 1 hour
>> > digest_swapout_chunk_size 4096 bytes
>> > digest_rebuild_chunk_percentage 10
>> >
>> > 3. 2 Polygraph clients send HTTP requests to the Squid instances;
>> > each client targets a different instance. Each client is configured
>> > with 1000 users at 1.2 requests/s, so each client sends 1200
>> > requests/s in total.
>> >
>> > 4. Test results (note: with 4 CPU cores on the server, total CPU
>> > utilization is 400%)
>> > (1) Both Squid instances running with Cache Digest enabled, each
>> > handling 1200 requests/second: each instance used ~95% CPU, even
>> > while Squid was not rebuilding the digest.
>> > (2) Both instances with Cache Digest enabled, one handling 1200
>> > requests/second, one idle (no traffic to it): the busy instance used
>> > ~65% CPU; the other was idle.
>> > (3) Both instances with Cache Digest disabled, each handling 1200
>> > requests/second: each instance used ~75% CPU.
>> >
>> > Best Regards,
>> > Bo Zhou
>> >
>>
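A side note on the digest_bits_per_entry 5 setting in the configuration above: treating the digest as a standard Bloom filter with four hash functions (an assumption about Squid's defaults, not verified here), the theoretical false-positive rate at full capacity works out to roughly 9%, which bounds how many wasted peer queries a hit-free workload would avoid:

```python
import math

bits_per_entry = 5   # digest_bits_per_entry from the squid.conf above
hash_functions = 4   # assumed hash-function count; not taken from the source
# Standard Bloom-filter false-positive approximation:
# p = (1 - e^(-k*n/m))^k, with m/n = bits_per_entry and k = hash_functions.
p = (1 - math.exp(-hash_functions / bits_per_entry)) ** hash_functions
print(f"approx false-positive rate: {p:.1%}")
```

Raising bits_per_entry lowers this rate at the cost of a larger digest to rebuild, swap out, and fetch from peers, which is the trade-off behind the rebuild overheads discussed above.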