Re: SIGTERM SIGKILL causes issues with squid shutdown during reboot

Stanford Prescott <stan.prescott@xxxxxxxxx> · Fri, 24 Jul 2015 10:23:15 -0500

Thanks, Amos. That was very helpful. Smoothwall Express does not use systemd and never has, precisely for the reasons you mention. It does use udev, and we are considering moving to eudev; that is a fairly large change, but likely worth it.

We have some things to think about now with possibly redesigning the system shutdown.

Regards.

Stan

On Fri, Jul 24, 2015 at 8:10 AM, Amos Jeffries <squid3@xxxxxxxxxxxxx> wrote:
Response inline.

On 24/07/2015 8:21 a.m., Stanford Prescott wrote:

> After bumping Squid from 3.4.x to 3.5.x in our implementation of Squid in

> the Smoothwall Express v3.1 firewall distro we have begun to have

> complaints from our users about "erratic behavior" of Squid shutting down

> during reboots or network drops causing reboots.

>

> It appears that squid (v3.5.[5-6]) does not respond well to SIGTERM during

> system shutdown; the cache index almost invariably needs to be rebuilt on

> next boot. It is suggested that we use squid to shut squid down. While

> using squid to stop the squid daemon is very doable, this requirement runs

> contrary to the longstanding, traditional UNIX method of "SIGTERM, pause,

> SIGKILL".

Squid is designed to operate in UNIX style. It contains its own daemon

manager, and a worker process.

When SIGTERM is received by the manager process, it relays that to the

worker which begins shutdown immediately but with shutdown_lifetime

grace period for active clients to finish. A second SIGTERM will act

like shutdown_lifetime having expired, everything gets closed immediately.

SIGKILL causes the OS to erase the Squid processes from

system memory, abort all the open sockets and disk I/O, etc. The

nuclear option, as some often call it.

This causes issues with systemd, openrc, or upstart where the init

system insists on being a daemon manager itself. IIRC there was an

excellent tutorial somewhere (by DJB?) that explained in great detail

why having a daemon manager to operate other daemon managers was a

terrible idea. Squid hits pretty much all the relevant nasty side

effects as each piece of the system makes bad assumptions or gets

confused about other bits' activities.

Meanwhile if just left alone the worker is still trying to gracefully

close client transactions for shutdown_lifetime (30sec default). Then it

aborts those and saves the final cache index to a new swap.state file on disk.
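
To make the two-stage SIGTERM behaviour described above concrete, here is a minimal Python sketch. It is a toy model only, not Squid code: the class name ShutdownTracker, its method names, and the state labels are all hypothetical, and the clock is injectable purely so the logic can be exercised without waiting.

```python
import time


class ShutdownTracker:
    """Toy model of a worker's reaction to SIGTERM (hypothetical, not Squid code)."""

    def __init__(self, shutdown_lifetime=30.0, clock=time.monotonic):
        self.shutdown_lifetime = shutdown_lifetime  # grace period in seconds
        self.clock = clock
        self.deadline = None  # None means "running normally"

    def on_sigterm(self):
        now = self.clock()
        if self.deadline is None:
            # First SIGTERM: start the grace period for active clients.
            self.deadline = now + self.shutdown_lifetime
        else:
            # Second SIGTERM: behave as if the grace period had just expired.
            self.deadline = now

    def state(self):
        if self.deadline is None:
            return "running"
        if self.clock() < self.deadline:
            return "draining"  # still letting active clients finish
        return "closing"       # abort what is left, then write the index
```

In this model the index is only written once the "closing" state is reached, which is why a SIGKILL arriving at that point is the most damaging.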

>

> During normal system operation, squid *ought* to be able to take as much

> time as it wants to shut down. But it still shouldn't take more than 10-20

> seconds; 'shutdown' is a command, not a request to be honored at squid's

> leisure. After all, the CPU could be on fire....

>

> This raises a few questions that are intended to foster fresh discussion,

> not to re-hash old arguments. They are really more rhetorical in nature;

> the goal is to find the root cause of the problem.

>

>    - Why do these latest versions of Squid 3 behave oddly in this respect?

systemd is becoming popular. See above. Impatience and incorrect init

scripts using SIGKILL instead of SIGTERM do seem to be the main pathway

to having issues.

Squid-3 has become an SMP multi-process thing, which increases the number

of workers, and the things they are still doing during shutdown.

Squid-3.1 & 3.2 added various checksums to swap.state format protecting

against file corruption propagating into the memory index of the next

started Squid process.

>    - What is it about shutting squid down that corrupts the cache index?

"corruption" in the cache index means the swap.state file for each cache

has not been completely written. On next load the file checksum does not

match the contents checksum, so Squid reverts to a relatively slow scan of the

disk to see what is there.
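
The "verify the saved index, otherwise rescan" idea can be sketched as follows. This is an illustration under stated assumptions, not Squid's swap.state format: the single SHA-256 trailer, the fixed record size, and the function names write_index, load_index, and load_or_rescan are all invented for the example.

```python
import hashlib
import os


def write_index(path, records):
    """Write the records (bytes objects) followed by a SHA-256 trailer."""
    body = b"".join(records)
    with open(path, "wb") as f:
        f.write(body)
        f.write(hashlib.sha256(body).digest())


def load_index(path, record_size):
    """Return the record list, or None if the file is missing or fails its checksum."""
    try:
        with open(path, "rb") as f:
            data = f.read()
    except FileNotFoundError:
        return None
    if len(data) < 32:
        return None
    body, trailer = data[:-32], data[-32:]
    if hashlib.sha256(body).digest() != trailer or len(body) % record_size:
        return None  # incomplete or damaged file, e.g. killed mid-write
    return [body[i:i + record_size] for i in range(0, len(body), record_size)]


def load_or_rescan(path, record_size, cache_dir):
    """Use the saved index when it verifies; otherwise walk the cache directory."""
    records = load_index(path, record_size)
    if records is not None:
        return "index", records
    found = []
    for root, _dirs, files in os.walk(cache_dir):
        found.extend(os.path.join(root, name) for name in files)
    return "rescan", sorted(found)
```

A file truncated partway through the write fails the comparison, so the loader takes the slow directory walk instead of trusting partial data.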

Using SIGKILL while the index is being written to its disk FD will

cause it nearly 100% of the time. Using it before the old

swap.state is replaced will lead to a lot of SWAPFAIL, as objects on disk

today don't match what was on disk when Squid restarted a week ago.

What corruption happens also varies between disk controllers. Some

controllers complete already scheduled writes. Some drop them. SIGKILL

may be relevant in that decision too. The ones that complete all

scheduled writes and close the file hit corruption less often, or less

badly.

>    - Does it take more than a few seconds to write the index to disk?

No. But it can take days for active clients to finish off long-polling

HTTP requests or CONNECT tunnels. shutdown_lifetime is what they are

allowed. IME, Squid workers usually completely exit within 5 seconds of

that grace period being over. It takes under a ms to close a socket, but

many tens of thousands of sockets may be open on a busy proxy.

>    - Does squid use the very slow 'writethrough' method instead of trusting

>    the Linux disk cache to properly save the cache?

Squid is generating new content and writing it to a file with write(2)

API. That content is anything up to 68MB in size *per cache_dir*. How

fast can your disk save that much data when it's passed in ~100 byte IO

chunks?

Whatever the OS filesystem does to optimize I/O happens on top of that

process-level activity.
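
Taking the figures above at face value, the number of write(2) calls per cache_dir is simple arithmetic. The sketch below assumes "68MB" means decimal megabytes and that every chunk is exactly 100 bytes; write_calls is a hypothetical helper name.

```python
def write_calls(total_bytes, chunk_bytes):
    """Number of write(2) calls needed to emit total_bytes in chunk_bytes pieces."""
    return -(-total_bytes // chunk_bytes)  # ceiling division


index_bytes = 68 * 1000 * 1000  # "68MB", read here as decimal megabytes
chunk_bytes = 100               # "~100 byte" chunks from the text
calls = write_calls(index_bytes, chunk_bytes)
print(calls)  # 680000 calls per cache_dir
```

If each call cost, say, 5 microseconds (an assumed figure, not a measurement), that alone would be about 3.4 seconds per cache_dir before any disk latency is counted.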

As mentioned above SIGKILL also does not shut down the process (that

would be SIGTERM). Just drops it all on the floor. Using SIGKILL on a

process currently writing to disk has varying results from the disk

driver layers - usually not good.

SIGTERM will almost guarantee the files complete writing and things get

gracefully resolved.

If you haven't guessed already IMO you should at worst use SIGTERM twice

on Squid instead of SIGKILL even once. If Squid is actually operational

a second SIGTERM will have the same effect as shutdown_lifetime having

finished early.

Sadly, systemd and init scripts can be rather free with SIGKILL,

which can be problematic.

>    - Should squid write the cache index to disk more often?

The cache index is extremely volatile. Every single request through it

adds or removes an entry. Which is a lot of disk overhead on a very busy

proxy.

The compromise is to write it only once on shutdown/restart when all

updates have been finished. Or on "-k rotate" (SIGUSR1) when explicitly

asked to auto-save all state for log file rotations.

>    - Should squid track its own 'dirty pages' in its in-core index to

>    reduce the time it takes to write the index?

There is not much to track. The "pages" as it were are little marker

records a few hundred bytes stating what HTTP message is contained in

any given on-disk file. The memory copy gets erased as soon as the disk

controller has been asked to delete the file. The running worker does

not track whether the delete worked or not, just that all index entries

are supposedly valid HTTP objects. Whatever files get left undeleted on

disk can be overwritten with new content. [enter a few race condition bugs].

On shutdown the (up to 2^27 x 512 bytes) set of then-valid records are

dumped to disk in a new swap.state.

>    - Should squid implement a journal (akin to EXT/ReiserFS and others) so

>    the on-disk index structure is always OK?

Been tried. store_log is the journal. You can turn it on if you have any

need for external software to track what Squid cache contains in near

real-time.

Overall it slows Squid down and it is not useful to remember objects

that were saved X days ago and deleted a few seconds later.

It won't protect against swap.state corruption anyway, since the journal/log

file would get truncated/corrupted in the same ways as swap.state.

>    - Is the problem related to clients actively receiving web pages from

>    squid?

Indirectly. The delay added by waiting for them increases the

impatience of external daemon managers and the chance of admins using

SIGKILL on the worker.

>    - Could squid's signal handling be adjusted to treat those clients more

>    harshly? That is, terminate the transfers early because squid shutting down

>    is much the same as the network interface going down.

It could. I have actually just applied a patch to Squid-4 that makes

Squid abort idle client connections on the first SIGTERM and more

cleanly close *all* client FDs at the end of the shutdown_lifetime (or

second SIGTERM) before it uses abort() on what's left. That will probably

be in 3.5.7 as well.

Hopefully that will result in fewer SWAPFAIL corruption issues in future

for some people. It may or may not help with swap.state corruption since

that is a step later.

Long term plan is to handle shutdown in such a way that non-busy Squid

may even exit before shutdown_lifetime is over. Which should definitely

include proper swap.state closure.

>    - Is the only viable solution to use squid to stop the daemon? Does

>    'squid -k shutdown' exit only after the daemon is dead? The squid man page

>    indicates that it sends the shutdown 'signal' but does not await a reply;

>    thus a potentially lengthy pause in system shutdown may be required anyway.

The man page is correct. "squid -k shutdown" sends a SIGTERM signal to

the process listed in Squid's PID file.

 NP: any other UNIX style process could happily do the same to control

Squid.

The -k process itself exits immediately when that signal is sent.

The running squid daemon manager (or coordinator) process handles that

signal as mentioned above.
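
As a rough illustration of what "any other UNIX style process" would need to do, here is a minimal Python sketch that reads a PID file and sends the signal. The function name signal_from_pid_file is hypothetical, and the example path in the usage note is an assumption; the real location depends on how Squid was built and configured.

```python
import os
import signal


def signal_from_pid_file(pid_file, sig=signal.SIGTERM):
    """Send sig to the PID recorded in pid_file and return that PID.

    Like "squid -k shutdown", this returns as soon as the signal has been
    sent; it does not wait for the daemon to exit.
    """
    with open(pid_file) as f:
        pid = int(f.read().strip())
    os.kill(pid, sig)
    return pid
```

A caller might use signal_from_pid_file("/var/run/squid.pid") and then wait separately for the processes to exit, since sending the signal says nothing about when shutdown completes.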

>    - Is pausing the system shutdown for 10-15 seconds the only really

>    viable shutdown solution?

In current Squid releases yes. The alternative is to risk the corruption

problems.

>    - Or is it best to delete and rebuild the cache index on every system

>    startup?

No. Just configure shutdown_lifetime to a few seconds, and require the

init scripts system to wait that long plus 2-3 more sec for the whole

bunch of Squid processes to complete normally. Use a second SIGTERM if really

necessary to speed things up.

There are occasional exceptions, you may be one, but for most

installations it seems to still work fine that way.
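
The stop sequence recommended above (SIGTERM, wait shutdown_lifetime plus a few seconds, then at most a second SIGTERM, never SIGKILL) could be sketched in Python as below. This is a sketch under assumptions, not a drop-in init script: wait_for_exit and stop_squid are hypothetical names, the 5 second and 3 second defaults are example values (the first assumes shutdown_lifetime has been set to 5 seconds in squid.conf), and the PID-existence check assumes the daemon is not a child of the calling process.

```python
import os
import signal
import time


def wait_for_exit(pid, timeout, poll=0.1):
    """Return True once pid no longer exists, False if timeout elapses first."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            os.kill(pid, 0)  # signal 0 only checks that the PID exists
        except ProcessLookupError:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)


def send_term(pid):
    """Send SIGTERM; return False if the process has already gone."""
    try:
        os.kill(pid, signal.SIGTERM)
        return True
    except ProcessLookupError:
        return False


def stop_squid(pid, shutdown_lifetime=5.0, margin=3.0):
    """SIGTERM, wait out the grace period, then SIGTERM again. Never SIGKILL."""
    if not send_term(pid):
        return "clean"
    if wait_for_exit(pid, shutdown_lifetime + margin):
        return "clean"
    send_term(pid)  # second SIGTERM ends the grace period early
    if wait_for_exit(pid, margin):
        return "forced"
    return "still running"
```

A "still running" result is left for the caller to handle; the point of the sketch is that escalation stops at a second SIGTERM so the index write is not interrupted.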

HTH

Amos

_______________________________________________

squid-users mailing list

squid-users@xxxxxxxxxxxxxxxxxxxxx

http://lists.squid-cache.org/listinfo/squid-users
