On 2025-03-02 15:47, Lubos Uhliarik wrote:
2024/10/16 17:52:44 kid10| Adaptation support is off. 2024/10/16 17:52:44 kid10| assertion failed: Queue.cc:388: "EX"
Squid v6 has a few changes that affect SMP startup and shutdown sequences. Since you have ruled out upgrading to a supported version, I would start by turning on shared_memory_locking in hope that the problem lies in a problematic OS configuration.
BTW, that bogus exception text "EX" is a bug. It was fixed in Squid v6. Squid v5.10 has a backport of that fix (commit 31f20fda).
Alex.
We have recently experienced Squid crashes when running Squid in SMP mode with 16 workers (workers 16). While reviewing the Squid cache logs, I noticed that assertion failures sometimes occurred (usually during the shutdown process): 2024/10/16 17:52:44 kid10| Adaptation support is off. 2024/10/16 17:52:44 kid10| assertion failed: Queue.cc:388: "EX" 2024/10/16 17:52:44 kid7| Squid plugin modules loaded: 0 2024/10/16 17:52:44 kid7| Adaptation support is off. 2024/10/16 17:52:44 kid7| assertion failed: Queue.cc:388: "EX" 2024/10/16 17:52:44 kid4| Finished loading MIME types and icons. 2024/10/16 17:52:44 kid4| HTCP Disabled. 2024/10/16 17:52:44 kid4| Squid plugin modules loaded: 0 2024/10/16 17:52:44 kid4| Adaptation support is off. 2024/10/16 17:52:44 kid4| assertion failed: Queue.cc:388: "EX" 2024/10/16 17:52:44 kid3| Finished loading MIME types and icons. 2024/10/16 17:52:44 kid2| Finished loading MIME types and icons. 2024/10/16 17:52:44 kid5| Finished loading MIME types and icons. 2024/10/16 17:52:45 kid3| HTCP Disabled. 2024/10/16 17:52:45 kid2| HTCP Disabled. 2024/10/16 17:52:45 kid5| HTCP Disabled. 2024/10/16 17:52:45 kid5| Squid plugin modules loaded: 0 2024/10/16 17:52:45 kid5| Adaptation support is off. 2024/10/16 17:52:45 kid2| Squid plugin modules loaded: 0 2024/10/16 17:52:45 kid2| Adaptation support is off. 2024/10/16 17:52:45 kid3| Squid plugin modules loaded: 0 2024/10/16 17:52:45 kid3| Adaptation support is off. 2024/10/16 17:52:45 kid5| assertion failed: Queue.cc:388: "EX" 2024/10/16 17:52:45 kid2| assertion failed: Queue.cc:388: "EX" 2024/10/16 17:52:45 kid3| assertion failed: Queue.cc:388: "EX" I observed that each "kid" (worker) crashed once, and then Squid successfully stopped and restarted. Recently, however, the Squid crashes started occurring during the start process. The crashes now happen multiple times per worker, causing Squid to enter a crash loop. This continues until the hard drive fills up with core dumps: 2025/02/18 10:17:17 kid4| assertion failed: Queue.cc:388: "EX" 2025/02/18 10:17:17 kid10| Current Directory is / 2025/02/18 10:17:17 kid10| Starting Squid Cache version 5.5 for x86_64-redhat-linux-gnu... 2025/02/18 10:17:17 kid10| Service Name: squid 2025/02/18 10:17:17 kid10| Process ID 18779 2025/02/18 10:17:17 kid10| Process Roles: worker 2025/02/18 10:17:17 kid10| With 65536 file descriptors available 2025/02/18 10:17:17 kid10| Initializing IP Cache... 2025/02/18 10:17:17 kid10| DNS Socket created at [::], FD 13 2025/02/18 10:17:17 kid10| DNS Socket created at 0.0.0.0, FD 14 2025/02/18 10:17:17 kid10| Adding nameserver 127.0.0.1 from squid.conf 2025/02/18 10:17:17 kid10| Logfile: opening log /var/log/squid/access.log 2025/02/18 10:17:17 kid10| WARNING: log name now starts with a module name. Use 'stdio:/var/log/squid/access.log' 2025/02/18 10:17:17 kid10| Local cache digest enabled; rebuild/rewrite every 3600/3600 sec 2025/02/18 10:17:17 kid10| Store logging disabled 2025/02/18 10:17:17 kid10| Swap maxSize 0 + 262144 KB, estimated 20164 objects 2025/02/18 10:17:17 kid10| Target number of buckets: 1008 2025/02/18 10:17:17 kid10| Using 8192 Store buckets 2025/02/18 10:17:17 kid10| Max Mem size: 262144 KB [shared] 2025/02/18 10:17:17 kid10| Max Swap size: 0 KB 2025/02/18 10:17:17 kid10| Using Least Load store dir selection 2025/02/18 10:17:17 kid10| Current Directory is / 2025/02/18 10:17:17 kid10| Finished loading MIME types and icons. 2025/02/18 10:17:17 kid10| HTCP Disabled. 2025/02/18 10:17:17 kid10| Squid plugin modules loaded: 0 2025/02/18 10:17:17 kid10| Adaptation support is off. 2025/02/18 10:17:17 kid10| assertion failed: Queue.cc:388: "EX" 2025/02/18 10:17:17 kid7| Current Directory is / I investigated the core dump, and the code where Squid is crashing is as follows: #0 0x00007f77b288ba6c in __pthread_kill_implementation () from /lib64/libc.so.6 Missing separate debuginfos, use: dnf debuginfo-install glibc-2.34-125.el9_5.1.x86_64 keyutils-libs-1.6.3-1.el9.x86_64 krb5-libs-1.21.1-4.el9_5.x86_64 libcap-2.48-9.el9_2.x86_64 libcom_err-1.46.5-5.el9.x86_64 libecap-1.0.1-10.el9.x86_64 libgcc-11.5.0-2.el9.x86_64 libgcrypt-1.10.0-11.el9.x86_64 libgpg-error-1.42-5.el9.x86_64 libselinux-3.6-1.el9.x86_64 libstdc++-11.5.0-2.el9.x86_64 libtool-ltdl-2.4.6-46.el9.x86_64 libzstd-1.5.1-2.el9.x86_64 lz4-libs-1.9.3-5.el9.x86_64 openssl-libs-3.2.2-6.el9_5.x86_64 pcre2-10.40-6.el9.x86_64 sssd-client-2.9.5-4.el9_5.4.x86_64 systemd-libs-252-46.el9_5.2.x86_64 xz-libs-5.2.5-8.el9_0.x86_64 zlib-1.2.11-40.el9.x86_64 (gdb) bt #0 0x00007f77b288ba6c in __pthread_kill_implementation () from /lib64/libc.so.6 #1 0x00007f77b283e686 in raise () from /lib64/libc.so.6 #2 0x00007f77b2828833 in abort () from /lib64/libc.so.6 #3 0x0000559bbb412a9d in xassert (msg=<optimized out>, file=<optimized out>, line=<optimized out>) at /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/debug.cc:624 #4 0x0000559bbb7549f7 in Ipc::MultiQueue::reader (processId=3, this=0x559bbccbf6e0) at ipc/Queue.cc:388 #5 Ipc::MultiQueue::localReader (this=0x559bbccbf6e0) at ipc/Queue.cc:408 #6 0x0000559bbb74c192 in Ipc::BaseMultiQueue::localReader (this=<optimized out>) at ipc/Queue.cc:195 #7 Ipc::BaseMultiQueue::clearAllReaderSignals (this=<optimized out>) at ipc/Queue.cc:155 #8 0x0000559bbb486724 in CollapsedForwarding::HandleNewDataAtStart () at /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/CollapsedForwarding.cc:151 #9 0x0000559bbb6aa50b in AsyncCallQueue::fireNext (this=this@entry=0x559bbccd18a0) at base/../../src/base/RefCount.h:73 #10 0x0000559bbb6aa89a in AsyncCallQueue::fire (this=0x559bbccd18a0) at base/AsyncCallQueue.cc:43 #11 0x0000559bbb490da1 in EventLoop::dispatchCalls (this=0x7fff2d084b90) at /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/EventLoop.cc:144 #12 EventLoop::runOnce (this=0x7fff2d084b90) at /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/EventLoop.cc:109 #13 0x0000559bbb5929a8 in EventLoop::run (this=0x7fff2d084b90) at /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/EventLoop.cc:83 #14 SquidMain (argc=argc@entry=5, argv=argv@entry=0x7fff2d084ea8) at /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/main.cc:1716 #15 0x0000559bbb436086 in SquidMainSafe (argv=0x7fff2d084ea8, argc=5) at /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/main.cc:1403 #16 main (argc=5, argv=0x7fff2d084ea8) at /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/main.cc:1391 (gdb) f 4 #4 0x0000559bbb7549f7 in Ipc::MultiQueue::reader (processId=3, this=0x559bbccbf6e0) at ipc/Queue.cc:388 388 assert(validProcessId(processId)); (gdb) list 383 } 384 385 const Ipc::QueueReader & 386 Ipc::MultiQueue::reader(const int processId) const 387 { 388 assert(validProcessId(processId)); 389 const int index = processId - metadata->theProcessIdOffset; 390 return readers->theReaders[index]; 391 } 392 (gdb) p processId $1 = 3 (gdb) p metadata $15 = {<RefCount<Ipc::Mem::Object<Ipc::MultiQueue::Metadata> >> = {p_ = 0x559bbccd9f30}, <No data fields>} (gdb) p *metadata.p_ $16 = {<Lock> = {_vptr.Lock = 0x559bbb96ea30 <vtable for Ipc::Mem::Object<Ipc::MultiQueue::Metadata>+64>, count_ = 1}, _vptr.Object = 0x559bbb96ea08 <vtable for Ipc::Mem::Object<Ipc::MultiQueue::Metadata>+24>, theSegment = {theFD = 9, theName = {static npos = 18446744073709551615, size_ = 40, len_ = 23, static SizeMax_ = 196607, buf_ = 0x559bbccbc460 "/squid-cf__metadata.shm"}, theMem = 0x7f77b3720000, theSize = 8, theReserved = 0, doUnlink = false}, theObject = 0x7f77b3720000} (gdb) p metadata.p_->theObject $17 = (Ipc::MultiQueue::Metadata *) 0x7f77b3720000 (gdb) p *metadata.p_->theObject $18 = {theProcessCount = 0, theProcessIdOffset = 0} (gdb) p metadata.p_->theObject->theProcessIdOffset $19 = 0 From this investigation, it appears that the theProcessCount and theProcessIdOffset variables are set to 0. However, I don't think this is actually the case, as these variables should be shared in SHM, and the content of those files was not included in the core dump. Have any of you experienced similar crashes? I was trying to search for a similar issue but without success. The backtrace above is from Squid version 5.5. While I understand it is an older version, I believe the IPC code hasn’t changed much since then. Unfortunately, we don't have a reproducer or the possibility of upgrading Squid to a newer version on the running system for testing. It's also worth noting that after restarting Squid, which was stuck in the crash loop, Squid started normally. I suspect it could be caused by uninitialized SHM or something similar. What do you think? Would it help if I set some debug options and share the output with you? I was considering something like: debug_options ALL,1 1,9 54,9 Thanks in advance for your response. Regards, Lubos _______________________________________________ squid-users mailing list squid-users@xxxxxxxxxxxxxxxxxxxxx https://lists.squid-cache.org/listinfo/squid-users
_______________________________________________ squid-users mailing list squid-users@xxxxxxxxxxxxxxxxxxxxx https://lists.squid-cache.org/listinfo/squid-users