
Re: External helper consumes too many DB connections

On 2/8/22 09:50, roee klinger wrote:

Alex: If there are a lot more requests than your users/TTLs should
      generate, then you may be able to decrease db load by figuring out
      where the extra requests are coming from.

Actually, I don't think it matters much now that I think about it
again: per my requirements, I need to reload the cache every 60
seconds, which means that even a perfect cache would still put a high
load on MariaDB. I think the second approach will be better suited.

Your call. Wiping out the entire authentication cache every 60 seconds feels odd, but I do not know enough about your environment to judge.


Alex: aggregating helper-db connections (helpers can be written to
      talk through a central connection aggregator)


That sounds like exactly what I am looking for. How would one go about doing this?

You have at least two basic options:

A. Enhance Squid to let SMP workers share helpers. I assume that you have C SMP workers and N helpers per worker, with C and N significantly greater than 1. Instead of having N helpers per worker and C*N helpers total, you will have just one concurrent helper per worker and C helpers total. This will be a significant, generally useful improvement that should be officially accepted if implemented well. This enhancement requires serious Squid code modifications in a neglected, error-prone area, but it is certainly doable -- Squid already shares rock diskers across workers, for example.
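For context, the C and N above correspond to squid.conf directives along these lines (a sketch with made-up values, not taken from your setup; see squid.conf.documented for the exact per-worker semantics):

```
# squid.conf fragment (illustrative values)
workers 4                                # C = 4 SMP kid workers
auth_param basic program /usr/local/libexec/auth_helper.py
auth_param basic children 20             # N = 20 helpers, started per worker
auth_param basic credentialsttl 1 hour   # the authentication TTL discussed earlier
# Today: roughly C * N = 80 helper processes, each with its own DB connection.
```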

B. Convert your helper from a database client program to an Aggregator client program (and write the Aggregator). Depending on your needs and skill, you can use TCP or Unix Domain Sockets (UDS) for helper-Aggregator communication. The Aggregator may look very similar to the current helper, except it will not use stdin/stdout for receiving/sending helper queries/responses. This option also requires development, but it is much simpler than option A.
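To make option B concrete, here is a minimal sketch in Python (the language your helpers are written in), assuming UDS transport. Everything in it is illustrative rather than prescribed by Squid: the socket path, the check_credentials() stub standing in for the single shared MariaDB connection, and the bare "OK"/"ERR" line protocol, which mirrors the stdin/stdout protocol a basic auth helper already speaks:

```python
#!/usr/bin/env python3
# Hypothetical sketch of option B: one Aggregator process owns the single
# database connection; every helper becomes a thin relay between Squid
# (stdin/stdout) and the Aggregator. All names here (SOCKET_PATH,
# check_credentials) are illustrative.

import os
import socket
import sys
import threading

SOCKET_PATH = "/var/run/squid/auth-aggregator.sock"


def check_credentials(username, password):
    # Stand-in for the shared MariaDB lookup; a real Aggregator would keep
    # one (or a small pool of) DB connection(s) here instead of one
    # connection per helper process.
    return username == "demo" and password == "secret"


def handle_client(conn):
    # Speak the same line protocol the helper already uses on stdin/stdout:
    # "username password\n" in, "OK\n" or "ERR\n" out.
    with conn, conn.makefile("r") as reader:
        for line in reader:
            try:
                user, password = line.rstrip("\n").split(" ", 1)
            except ValueError:
                conn.sendall(b"ERR\n")
                continue
            ok = check_credentials(user, password)
            conn.sendall(b"OK\n" if ok else b"ERR\n")


def serve_forever():
    # Aggregator main loop: one thread per connected helper.
    try:
        os.unlink(SOCKET_PATH)
    except FileNotFoundError:
        pass
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
        srv.bind(SOCKET_PATH)
        srv.listen()
        while True:
            conn, _ = srv.accept()
            threading.Thread(target=handle_client, args=(conn,),
                             daemon=True).start()


def helper_main():
    # The helper shrinks to a relay: Squid's queries go to the Aggregator,
    # and the Aggregator's verdicts go back to Squid.
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as agg:
        agg.connect(SOCKET_PATH)
        replies = agg.makefile("r")
        for line in sys.stdin:
            agg.sendall(line.encode())
            sys.stdout.write(replies.readline())
            sys.stdout.flush()
```

Note that real basic auth helper input is URL-escaped, so the split-on-first-space rule above is a simplification; a production helper would unescape the fields first.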


HTH,

Alex.


On Tue, Feb 8, 2022 at 4:41 PM Alex Rousskov wrote:

    On 2/8/22 09:13, roee klinger wrote:

     > I am running multiple instances of Squid in a K8S environment. Each
     > Squid instance has a helper that authenticates users based on their
     > username and password; the scripts are written in Python.
     >
     > I have been facing an issue that, when under load, the helpers (even
     > with a 3600 sec TTL) swamp the MariaDB instance, causing it to reach
     > 100% CPU, basically, I believe, because each helper opens up its own
     > connection to MariaDB, which ends up as a lot of connections.
     >
     > My initial idea was to create a Redis DB next to each Squid instance
     > and connect each Squid to its own dedicated Redis. I will sync Redis
     > with MariaDB every minute, thus decreasing the connection count from
     > a few 100s to just 1 every minute. This will also improve speeds,
     > since Redis is much faster than MariaDB.
     >
     > The problem is, however, that there will still be many connections
     > from Squid to Redis, and that will probably consume a lot of DB
     > resources as well, which I don't actually know how to optimize,
     > since it seems that Squid opens many processes, and there is no way
     > to get them to talk to each other (except TTL values, which seem not
     > to help in my case, which I also don't understand why that is).
     >
     > What is the best practice to handle this? Considering I have the
     > following requirements:
     >
     >     1. Fast
     >     2. Refresh data every minute
     >     3. Consume the least amount of DB resources possible

    I would start from the beginning: Does the aggregate number of database
    requests match your expectations? In other words, do you see lots of
    database requests that should not be there given your user access
    patterns and authentication TTLs? In yet other words, are there many
    repeated authentication accesses that should have been authentication
    cache hits?

    If there are a lot more requests than your users/TTLs should generate,
    then you may be able to decrease db load by figuring out where the
    extra
    requests are coming from. For example, it is possible that your
    authentication cache key includes some noise that renders caching
    ineffective (e.g., see comments about key_extras in
    squid.conf.documented). Or maybe you need a bigger authentication cache.

    If the total stream of authentication requests during peak hours is
    reasonable, with few unwarranted cache misses, then you can start
    working on aggregating helper-db connections (helpers can be written to
    talk through a central connection aggregator) and/or adding database
    power (e.g., by introducing additional databases running on previously
    unused hardware -- just like your MariaDB idea).


    Cheers,

    Alex.


_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users



