Re: Proxy server to support a large number of simultaneous requests

On 5/31/23 02:56, Andrey K wrote:

 > Do you get close to 100% hit ratio if clients access these URLs
 > sequentially rather than concurrently? If not, then focus on that
 > problem before you open the collapsed forwarding Pandora box.
When I run curl sequentially like this:

for i in `seq 500`; do curl --tlsv1.2 -k --proxy 0001vsg01:3131 -v $URL >/dev/null 2>&1; done

I get only the first request with a TCP_MISS status and all the others with TCP_MEM_HIT:

    Cnt Status             Parent
    499 TCP_MEM_HIT/200/-  HIER_NONE/-
      1 TCP_MISS/200/200   FIRSTUP_PARENT/parent_proxy
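
(These counts are the access log aggregated the same way as in my earlier tests, roughly:

cat /var/log/squid.user/access.log | grep pdf | awk '{print $5" " $10}' | sort | uniq -c

where $5 is the %Ss/%03>Hs/%<Hs field and $10 the %Sh/%<A field of the extended-squid logformat shown below.)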

Excellent. This confirms that your Squid can successfully cache this object (in memory).


It is interesting to note that on both Squid versions, if I run a separate curl after processing 500 or 200 concurrent threads, I get a result with the status TCP_MISS/200.

The next step I would recommend is to study the very first cache miss _after_ the 500 or 200 concurrent threads test. Doing so may shed light on why Squid is refusing to serve that (presumably cached) object from the cache. I suspect that the object was marked for deletion earlier, but we should check before spending more time on more complex triage of concurrent cases. If you can share a (link to) compressed ALL,9 cache.log from that single transaction against Squid v6, I may be able to help you with that step.
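
For example, a minimal capture could look like this (a sketch; paths and the cache.log location depend on your installation):

  # in squid.conf, temporarily enable full debugging:
  #   debug_options ALL,9
  squid -k reconfigure     # apply the debugging change
  curl --tlsv1.2 -k --proxy 0001vsg01:3131 -v $URL >/dev/null 2>&1    # the single post-test request
  # revert the debug_options change in squid.conf, then:
  squid -k reconfigure
  cp cache.log cache.log.capture && xz cache.log.capture    # compress before sharing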


Cheers,

Alex.


 > What is your Squid version? Older Squids have more collapsed forwarding
 > bugs than newer ones. I recommend testing with Squid v6 or master/v7, at
 > least to confirm that the problem is still present in the latest
 > official code.
I ran the tests on Squid 5.9.
We compiled 6.0.2 (with delay pools disabled) and increased the memory parameters:
   cache_mem 2048 MB
   maximum_object_size_in_memory 10 MB
The complete configuration is shown below.

Now, on version 6.0.2, we have the following results:
500 threads - Hit ratio 3.8%:
       3 TCP_CF_HIT/200/- HIER_NONE/-
       2 TCP_CF_MISS/200/200 FIRSTUP_PARENT/parent_proxy
      16 TCP_HIT/200/- HIER_NONE/-
     467 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
      12 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy

200 threads - Hit ratio 8%:
       6 TCP_CF_HIT/200/- HIER_NONE/-
      10 TCP_HIT/200/- HIER_NONE/-
     176 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
       8 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy

50 threads - Hit ratio 82%:
      30 TCP_CF_HIT/200/- HIER_NONE/-
      11 TCP_HIT/200/- HIER_NONE/-
       1 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
       8 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy

The results are slightly worse than they were on version 5.9.
It is interesting to note that on both Squid versions, if I run a separate curl after processing 500 or 200 concurrent threads, I get a result with the status TCP_MISS/200, although the requested URL is already in the rock cache (I can see it in the contents of the cache using rock_cache_dump.pl, a utility I developed:
$VAR1 = {
           '1' => {
                    'VERSION' => 'Wed May 31 09:18:05 2023',
                    'KEY_MD5' => 'e5eb10f0ab7d84ff9d3fd1e5a6d3eb9c',
                    'OBJSIZE' => 446985,
                    'STD_LFS' => {
                                   'lastref' => 'Wed May 31 09:18:05 2023',
                                   'flags' => '0x4004',
                                   'expires' => 'Wed May 31 15:18:05 2023',
                                   'swap_file_sz' => 0,
                                   'refcount' => 1,
                                   'lastmod' => 'Wed Jun 29 16:09:14 2016',
                                   'timestamp' => 'Wed May 31 09:18:05 2023'
                                 },
                   'URL' => 'https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf'
                  }
         };

).

 > How much RAM does your server have? You are using default 256MB memory
 > cache (cache_mem). If you have spare memory, make your memory cache much
 > larger: A rock cache_dir cannot (yet) share the response _while_ the
 > response is being written to disk, so relying on cache_dir too much will
> decrease your hit ratio, especially in a collapsed forwarding environment.
The VM has 32 GB RAM. I configured cache_mem 2048 MB on version 6.0.2.
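
(As a sanity check between runs, the memory cache utilization can be read from the cache manager; this assumes squidclient is installed and the manager is reachable on the bump port:

squidclient -p 3131 mgr:info | grep -i 'storage mem'

)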

 > Is your Squid built with --enable-delay-pools? If yes, TCP_MISS does not
 > necessarily mean a cache miss (an old Squid bug), even if you do not use
 > any delay pools.
Yes, delay pools were enabled on version 5.9, though we don't use them. I disabled this feature in the 6.0.2 build (it is compiled without --enable-delay-pools).


 > Since you are trying to cache objects larger than 512KB, see
 > maximum_object_size_in_memory.
I configured maximum_object_size_in_memory 10 MB on version 6.0.2 (as video chunks are smaller than 7 MB).

 > Consider making your test much longer (more sequential requests per
 > client/curl worker), to see whether the cache becomes "stable" after one
 > of the first transactions manages to fully cache the response. This may
 > not help with older Squids, but might help with newer ones. However, you
 > should not test using real origin servers (that you do not control)!
I don't have any web servers of my own for testing, so I chose some resources on the public internet that have a robust infrastructure.
I will conduct the longer tests next week.
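
The longer test will look roughly like this (a sketch: N concurrent workers, each issuing M sequential requests; the per-worker count M is an addition to the curl_forker.sh shown in the earlier message):

#!/bin/sh
# N concurrent workers, each running M sequential curl requests
N=50
M=100
URL=https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf

for i in `seq $N`
do
   (
      for j in `seq $M`
      do
         curl --tlsv1.2 -k --proxy 0001vsg01:3131 -s "$URL" >/dev/null 2>&1
      done
   ) &
done
wait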

Kind regards,
       Ankor.

*squid.conf*
workers 21

sslcrtd_program /data/squid.user/usr/lib/squid/security_file_certgen -s /data/squid.user/var/lib/squid/ssl_db -M 20MB
sslcrtd_children 21

logformat extended-squid %{%Y-%m-%d %H:%M:%S}tl| %6tr %>a %Ss/%03>Hs/%<Hs %<st %rm %ru %un %Sh/%<A %mt %ea

logfile_rotate 0
access_log daemon:/var/log/squid.user/access.log logformat=extended-squid on-error=drop

cache_peer parent_proxy  parent 3128 0
never_direct allow all

cachemgr_passwd pass config

acl PURGE method PURGE
http_access allow PURGE

http_access allow all

http_port 3131 ssl-bump generate-host-certificates=on dynamic_cert_mem_cache_size=20MB tls-cert=/etc/squid.user/sslbump/bump.crt tls-key=/etc/squid.user/sslbump/bump.key
sslproxy_cert_error allow all

acl step1 at_step SslBump1
acl step2 at_step SslBump2
acl step3 at_step SslBump3
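
# peek at the client TLS hello at step 1 (to learn the SNI), then
# bump (decrypt) the connection at the later steps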

ssl_bump peek step1
ssl_bump bump step2
ssl_bump bump step3

cache_dir rock /data/squid.user/cache 20000 max-size=12000000
cache_swap_low 85
cache_swap_high 90

collapsed_forwarding on
cache_mem 2048 MB
maximum_object_size_in_memory 10 MB

pinger_enable off
max_filedesc 8192
shutdown_lifetime 5 seconds
netdb_filename none
log_icp_queries off

via off
forwarded_for delete

client_request_buffer_max_size 100 MB

coredump_dir /data/squid.user/var/cache/squid




Mon, May 29, 2023 at 23:17, Alex Rousskov <rousskov@xxxxxxxxxxxxxxxxxxxxxxx>:

    On 5/29/23 10:43, Andrey K wrote:

     > We need to configure a dedicated proxy server to provide caching of
     > online video broadcasts in order to reduce the load on the uplink
    proxy.
     > Hundreds of users will access the same video-chunks simultaneously.
     >
     > I developed a simple configuration for the test purposes (it is
    shown
     > below).
     > The *collapsed_forwarding* option is on.

    Do you get close to 100% hit ratio if clients access these URLs
    sequentially rather than concurrently? If not, then focus on that
    problem before you open the collapsed forwarding Pandora box.

    What is your Squid version? Older Squids have more collapsed forwarding
    bugs than newer ones. I recommend testing with Squid v6 or
    master/v7, at
    least to confirm that the problem is still present in the latest
    official code.

    How much RAM does your server have? You are using default 256MB memory
    cache (cache_mem). If you have spare memory, make your memory cache
    much
    larger: A rock cache_dir cannot (yet) share the response _while_ the
    response is being written to disk, so relying on cache_dir too much
    will
    decrease your hit ratio, especially in a collapsed forwarding
    environment.

    Is your Squid built with --enable-delay-pools? If yes, TCP_MISS does
    not
    necessarily mean a cache miss (an old Squid bug), even if you do not
    use
    any delay pools.

     Since you are trying to cache objects larger than 512KB, see
    maximum_object_size_in_memory.

    Consider making your test much longer (more sequential requests per
    client/curl worker), to see whether the cache becomes "stable" after
    one
    of the first transactions manages to fully cache the response. This may
    not help with older Squids, but might help with newer ones. However,
    you
    should not test using real origin servers (that you do not control)!


     > Could you clarify if this behavior of my squid is
     > a bug/misconfiguration, or if I'm running into server performance
     > limitations (squid is running on a VM with 22 cores)?

    Most likely, reduction of hit ratio with increase of concurrency is
    _not_ a performance limitation.


    HTH,

    Alex.


     > I selected a couple of cacheable resources in the internet for
    testing:
     >   - small size (~400 KB):
     >
     https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf
     >   - large (~8 MB):
     >
     https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf
     > To test simultaneous connections I am forking curl using a simple
    script
     > (it is also shown below).
     >
     > When I run a test (500 curl threads to
     >
     https://ia800406.us.archive.org/13/items/romeo-y-julieta-texto-completo/Romeo%20y%20Julieta%20-%20William%20Shakespeare.pdf) I see lots of TCP_MISS/200 with FIRSTUP_PARENT/parent_proxy records in the logs.
     >
     > A simple analysis shows a low percentage of cache hits:
     > cat /var/log/squid.user/access.log| grep '2023-05-29 14' | grep
    pdf  |
     > awk '{print $5" " $10}' | sort | uniq -c
     >       24 TCP_CF_MISS/200/- HIER_NONE/-
     >      457 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
     >       10 TCP_MISS/200/- HIER_NONE/-
     >        9 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy
     >
     > So the Hit ratio is about (500-457-9)*100/500=6.8%
     >
     > Almost the same situation we see when run 200 threads:
     > cat /var/log/squid.user/access.log| grep '2023-05-29 15:45' |
    grep pdf
     >   | awk '{print $5" " $10}' | sort | uniq -c
     >        4 TCP_CF_MISS/200/- HIER_NONE/-
     >      140 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
     >       40 TCP_MISS/200/- HIER_NONE/-
     >       16 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy
     >
     > This time the Hit ratio is about (200-140-16)*100/200=22%
     >
     > With 50 threads the Hit ratio is 90%:
     > cat /var/log/squid.user/access.log| grep '2023-05-29 15:50' |
    grep pdf
     >   | awk '{print $5" " $10}' | sort | uniq -c
     >       27 TCP_CF_MISS/200/- HIER_NONE/-
     >        1 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
     >       18 TCP_MISS/200/- HIER_NONE/-
     >        4 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy
     >
     > I thought that it should always be near 99% - only the first
    request to
     > an URL should be forwarded to the parent proxy and all subsequent
     > requests should be served from the cache.
     >
     > The situation is even worse with downloading a large file:
     > 500 threads (0.4%):
     > cat /var/log/squid.user/access.log| grep '2023-05-29 17:2' | grep
    pdf  |
     > awk '{print $5" " $10}' | sort | uniq -c
     >       10 TCP_CF_MISS/200/200 FIRSTUP_PARENT/parent_proxy
     >        2 TCP_CF_MISS/200/- HIER_NONE/-
     >      488 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
     >
     > 200 threads (3%):
     > cat /var/log/squid.user/access.log| grep '2023-05-29 17:3' | grep
    pdf  |
     > awk '{print $5" " $10}' | sort | uniq -c
     >        9 TCP_CF_MISS/200/200 FIRSTUP_PARENT/parent_proxy
     >        6 TCP_CF_MISS/200/- HIER_NONE/-
     >      180 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
     >        5 TCP_SWAPFAIL_MISS/200/200 FIRSTUP_PARENT/parent_proxy
     >
     > 50 threads (98%):
     > cat /var/log/squid.user/access.log| grep '2023-05-29 17:36' |
    grep pdf
     >   | awk '{print $5" " $10}' | sort | uniq -c
     >       25 TCP_CF_HIT/200/- HIER_NONE/-
     >       12 TCP_CF_MISS/200/- HIER_NONE/-
     >       12 TCP_HIT/200/- HIER_NONE/-
     >        1 TCP_MISS/200/200 FIRSTUP_PARENT/parent_proxy
     >
     > Could you clarify if this behavior of my squid is a
     > bug/misconfiguration, or if I'm running into server performance
     > limitations (squid is running on a VM with 22 cores)?
     >
     > Kind regards,
     >       Ankor
     >
     >
     >
     > *squid.conf:*
     > workers 21
     >
     > sslcrtd_program
    /data/squid.user/usr/lib/squid/security_file_certgen -s
     > /data/squid.user/var/lib/squid/ssl_db -M 20MB
     > sslcrtd_children 21
     >
     > logformat extended-squid %{%Y-%m-%d %H:%M:%S}tl| %6tr %>a
     > %Ss/%03>Hs/%<Hs %<st %rm %ru %un %Sh/%<A %mt %ea
     >
     > logfile_rotate 0
     > access_log daemon:/var/log/squid.user/access.log
     > logformat=extended-squid on-error=drop
     >
     > cache_peer parent_proxy  parent 3128 0
     > never_direct allow all
     >
     > cachemgr_passwd pass config
     >
     > acl PURGE method PURGE
     > http_access allow PURGE
     >
     > http_access allow all
     >
     > http_port 3131 ssl-bump generate-host-certificates=on
     > dynamic_cert_mem_cache_size=20MB
     > tls-cert=/etc/squid.user/sslbump/bump.crt
     > tls-key=/etc/squid.user/sslbump/bump.key
     > sslproxy_cert_error allow all
     >
     > acl step1 at_step SslBump1
     > acl step2 at_step SslBump2
     > acl step3 at_step SslBump3
     >
     > ssl_bump peek step1
     > ssl_bump bump step2
     > ssl_bump bump step3
     >
     > cache_dir rock /data/squid.user/cache 20000 max-size=12000000
     > cache_swap_low 85
     > cache_swap_high 90
     >
     > *collapsed_forwarding on*
     >
     > pinger_enable off
     > max_filedesc 8192
     > shutdown_lifetime 5 seconds
     > netdb_filename none
     > log_icp_queries off
     > client_request_buffer_max_size 100 MB
     >
     > via off
     > forwarded_for delete
     >
     > coredump_dir /data/squid.user/var/cache/squid
     >
     > *curl_forker.sh:*
     > #!/bin/sh
     > N=100
     >
     URL=https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf
     >
     > if [[  -n $1 &&  $1 =~ help$  ]];
     > then
     >     echo "Usage: $0 [<cnt>] [<url>]"
     >     echo
     >     echo "Example: $0 10
     >
     https://ia600601.us.archive.org/10/items/Linux-Journal-2015-01/Linux-Journal-2015-01.pdf";
     >     echo
     >     exit;
     > fi
     >
     > while [[ $# -gt 0 ]]
     > do
     >    if [[ $1 =~ ^[0-9]+$ ]]
     >    then
     >       N=$1
     >    else
     >       URL=$1
     >    fi
     >    shift
     > done
     >
     > echo $URL
     > echo $N threads
     >
     > for i in `seq $N`
     > do
     >    nohup curl --tlsv1.2 -k   --proxy 0001vsg01:3131  -v $URL
      >/dev/null
     >   2>&1 &
     > done
     >


_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users



