On 2014-02-04 00:39, Henrik Lidström wrote:
On 01/29/14 22:10, Nikolai Gorchilov wrote:
Can somebody reproduce the problem? Or am I the only one who has it?
Best,
Niki
P.S. I did some additional testing with diskd storage - no problems were
detected. IPC seems to work as expected in diskd mode.
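For reference, switching between the two stores only needs the cache_dir
line changed; roughly (paths and sizes below are placeholders, not my
actual values):

    # rock store - disk I/O is handled by a dedicated disker kid process
    cache_dir rock /cache/rock 16384
    # diskd store - uses a diskd helper per cache_dir, no disker involved
    cache_dir diskd /cache/diskd 16384 16 256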
On Wed, Jan 29, 2014 at 12:26 PM, Nikolai Gorchilov <niki@xxxxxxxx> wrote:
I just made a new discovery - the problem seemingly disappears when
running Squid with -N, thus taking the disker process out of the
equation.
Could it be a worker-disker IPC issue rather than a rock-store-specific
problem?
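For completeness, the -N test amounts to starting the same config in
no-daemon, single-process mode (the config path below is just a
placeholder):

    # -N: run in the foreground as a single process,
    # so no separate worker or disker kids are spawned
    squid -N -f /etc/squid/squid.conf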
Best,
Niki
On Wed, Jan 29, 2014 at 12:00 PM, Nikolai Gorchilov <niki@xxxxxxxx> wrote:
Dear Amos
On Wed, Jan 29, 2014 at 5:18 AM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
The simple steps to reproduce:
1. Empty the store dir and recreate the swap with -z.
Did you leave sufficient time after doing this to ensure that the -z
operation was completed successfully and all the "squid -z" spawned
processes stopped?
Yes.
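Roughly, the sequence is the following (the config path is just a
placeholder, not my actual one):

    # (re)create the swap structures
    squid -f /etc/squid/squid.conf -z
    # wait until ps/pgrep shows no remaining "squid -z" processes,
    # then start the main instance
    squid -f /etc/squid/squid.conf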
2. Start Squid with the above config
3. Open a new private browser session (or clear the browser cache) and
request www.w3.org. Most of the 35+ requested items will get cached.
4. Once the page loads fully, request a full reload
(Ctrl-Shift-R/Cmd-Shift-R).
The step 4 operation is blocked by the "reload_into_ims on" config
directive, which turns these force-reload (Cache-Control: no-cache)
requests into If-Modified-Since (IMS) requests.
Under correct protocol operation they should be TCP_MISS responses.
Exactly. This problem exists only when delivering TCP_HITs.
TCP_REFRESH_UNMODIFIED for instance works fine.
As mentioned in the conf comments, reload_into_ims is used only to make
the problem easier to reproduce - having multiple TCP_HITs in a short
period of time somehow triggers it.
The same scenario works perfectly fine when using aufs.
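The relevant directives look roughly like this (a sketch with a
placeholder path and size; only the cache_dir line changes between the
rock and aufs tests):

    # turn forced reloads (Cache-Control: no-cache) into
    # If-Modified-Since revalidations, to generate hits more easily
    reload_into_ims on
    # problematic:
    cache_dir rock /cache/rock 16384
    # works fine:
    #cache_dir aufs /cache/aufs 16384 16 256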
During this (and every subsequent) reload a few (1-3-5) random requests
out of all 35+ objects enter this strange state and stay there until the
user aborts the connection.
Access.log excerpts:
rock: http://pastebin.com/QWTgqRTD
aufs: http://pastebin.com/0WGdHPP9
Pay attention to the '===[]===' marks in the logs, which mark the
user's actions.
Please help! Can you reproduce the problem, or is it somehow related to
my environment?
I note that you are only waiting 5-10 seconds between the refresh and
escape. While this may be long enough, does the trace show any
difference if you leave it for a much longer time, such as 30 or 60
seconds?
I have seen connections in that state waiting for hours (say, one
night).
I just did a 30-minute test:
1390989136.532 1910610 172.16.101.252 TCP_HIT_ABORTED/000 0 GET
http://www.w3.org/2008/site/css/minimum - HIER_NONE/- -
I checked active_connections for this same connection just before
hitting ESC. Here's what Squid reports:
Connection: 0x28e8be8
FD 22, read 322, wrote 0
FD desc: Reading next request
in: buf 0x28eb9b0, offset 0, size 4096
remote: 172.16.101.252:58477
local: 192.168.1.10:3128
nrequests: 1
uri http://www.w3.org/2008/site/css/minimum
logType TCP_HIT
out.offset 0, out.size 0
req_sz 322
entry 0x2935d50/DC7BF6E1F61DA763BBCBD0043D4281A4
start 1390987225.923229 (1886.649296 seconds ago)
username -
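For reference, this per-connection report presumably comes from the
cache manager's active requests page; assuming the manager is reachable
on the proxy's listening address shown above, it can be fetched with
something like:

    squidclient -h 192.168.1.10 -p 3128 mgr:active_requests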
Hope this helps!
Best,
Niki
Hi there.
I noticed the same thing when testing 3.HEAD-20140127-r13248 with large
rock support this weekend at home. Connections stall forever after a
while, until ESC (stopping the browser), as you say. This is on FreeBSD
9.2-STABLE which usually runs aufs, so no extra IPC/SHM configuration
had been done. I reverted to aufs, as there was no time to do further
analysis.
ex:
1391183222.450 293652 192.168.2.107 TCP_HIT_ABORTED/000 0 GET
http://s.ytimg.com/yts/jsbin/www-vflZWeAO9/www_common_mod.js -
HIER_NONE/- -
/Henrik
This is being discussed in the squid-dev mailing list thread "Re:
Regression introduced in trunk r13201 (large rock merge)" (I have cc'd
you on my reply there).
PS. Problems with "alpha" release code (AKA 3.HEAD, trunk, and other
development bzr branches) are actually off-topic here. Please follow up
in the squid-dev thread.
Amos