Re: Sockets not closed after ICAP receivedWholeAdaptedReply

Alex Rousskov <rousskov@xxxxxxxxxxxxxxxxxxxxxxx> · Wed, 9 Aug 2023 16:08:03 -0400

On 8/9/23 14:33, mailing@xxxxxxxxx wrote:

Do you have any remarks on the change? Maybe you have some more 
insight on the interplay of the code or what seems to be causing the
issue in detail.


Hi Leonardo,

    I do, but this mailing list is not the best place to discuss Squid 
code. If you want these or similar changes merged into Squid, then I 
suggest posting a draft Pull Request on GitHub, where it is easier to 
collaborate and provide code-specific feedback. Ideally, use a dedicated 
git branch that Squid core developers can modify and experiment with 
(e.g., do _not_ use a cloned official branch like "master" or a custom 
branch that you automatically deploy!).
https://wiki.squid-cache.org/MergeProcedure#pull-request


For now, I will leave you with a few questions and brief comments:

* Does the problem go away if you remove the patch in PR #1400? Or are 
you unable to test this use case without that patch?

* Please clarify whether the HTP response encapsulated in the ICAP 
response should set ENTRY_BAD_LENGTH in a bug-free Squid code. It was 
not clear (to me) whether there is something wrong with that HTTP response.

* ICAP Connection and HTTP Connection headers in ICAP responses should 
not affect encapsulated HTTP response length. If they do, there is a bug 
somewhere.

* If your patch is the right thing to do, then, most likely, more 
changes are required. For example, a similar condition may happen on the 
ConnStateData::afterClientRead() code path AFAICT.

* PR 1443 has overlapping changes. They will not fix your specific use 
case, but, ideally, I would prefer to land that (IMO, ready) PR before 
attacking this problem because that PR has relevant code changes that we 
should not revisit in a parallel PR.


HTH,

Alex.


On 8/9/23 14:33, mailing@xxxxxxxxx wrote:
we have been having some issues lately with our Squid proxies on version 
6.1.

We also have the patch for the REQMOD satisfaction regression installed: 
https://github.com/squid-cache/squid/pull/1400 

The issue:

When there is a request sent to the ICAP server and the ICAP server is 
replying with a modified response, this modified response, after 
checking through receivedWholeAdeptedReply, doesn't set a BAD_LENGTH, 
which usually turns into a STREAM_UNPLANNED_COMPLETE, but rather a 
STREAM_COMPLETE (in Http::Stream::writeComplete). This in turn calls 
ConnStateData::kick on the current context, which has an empty pipeline 
at some point. Because this case is not specifically checked and the 
connection is just abandoned, the socket is never removed. This causes a 
lot of open connections and a high amount of file descriptors for the 
corresponding squid processes that we frequently run into limits.

As a hot fix I made the following changes in order to accommodate for 
this specific case:

--- a/src/client_side.cc
+++ a/src/client_side.cc
@@ -989,6 +989,9 @@ ConnStateData::kick()
      } else if (flags.readMore) {
          debugs(33, 3, clientConnection << ": calling readNextRequest()");
          readNextRequest();
+    } else if (pipeline.empty()) {
+        debugs(33, 3, "pipeline is empty - closing connection");
+        clientConnection->close();
      } else {
          // XXX: Can this happen? CONNECT tunnels have deferredRequest set.
          debugs(33, DBG_IMPORTANT, MYNAME << "abandoning " << 
clientConnection);

This issue also seems to happen if the ICAP server has specific 
responses - it seems as if the STREAM_COMPLETE case happens (implying 
receivedWholeAdaptedReply) if there is a Connection: closed response 
from ICAP, rather than Connection: keep-alive. In the latter case, the 
connection is torn down with a STREAM_UNPLANNED_COMPLETE. The patch, 
setting receivedWholeAdaptedReply always to true in those cases also 
means that the connection is never closed and abandoned all the time if 
it includes REQMOD it seems, rather than just doing that if the response 
from ICAP implies a closed connection.

I was not able to reproduce the issue with version 5.8 as of now, so 
this seems to be something specific to 6.1.

Do you have any remarks on the change? Maybe you have some more insight 
on the interplay of the code or what seems to be causing the issue in 
detail.

Sincerely,

Leonardo Martinho

_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users

_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
http://lists.squid-cache.org/listinfo/squid-users