Search squid archive

Re: assertion failed: Queue.cc:388: "EX"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Alex,

Thanks for your reply!

On Mon, Mar 3, 2025 at 3:44 PM Alex Rousskov <rousskov@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
On 2025-03-02 15:47, Lubos Uhliarik wrote:


> 2024/10/16 17:52:44 kid10| Adaptation support is off.
> 2024/10/16 17:52:44 kid10| assertion failed: Queue.cc:388: "EX"

Squid v6 has a few changes that affect SMP startup and shutdown
sequences. Since you have ruled out upgrading to a supported version, I
would start by turning on shared_memory_locking in hope that the problem
lies in a problematic OS configuration.

BTW, that bogus exception text "EX" is a bug. It was fixed in Squid v6.
Squid v5.10 has a backport of that fix (commit 31f20fda).


I was able to resolve the issue, so in case someone encounters the same backtrace,
here’s what happened:

In the end, we discovered that a customer was running two Squid instances on the same
machine—one SMP instance with 16 workers and one single-process instance. For a long
time, the single-process Squid instance would start first by chance, followed by
the SMP instance, which overwrote the shared memory. However, at some point,
the SMP instance started first, and when the single-process Squid instance started
second, it overwrote the SHM memory (resetting theProcessCount to 1) of the
SMP instance, causing it to crash with the mentioned backtrace.

The fix is simple—just specify a service name using the -n Squid parameter for
one of the Squid instances.


Alex.


> We have recently experienced Squid crashes when running Squid in SMP
> mode with 16 workers (workers 16). While reviewing the Squid cache
> logs, I noticed that assertion failures sometimes occurred (usually
> during the shutdown process):
>
> 2024/10/16 17:52:44 kid10| Adaptation support is off.
> 2024/10/16 17:52:44 kid10| assertion failed: Queue.cc:388: "EX"
> 2024/10/16 17:52:44 kid7| Squid plugin modules loaded: 0
> 2024/10/16 17:52:44 kid7| Adaptation support is off.
> 2024/10/16 17:52:44 kid7| assertion failed: Queue.cc:388: "EX"
> 2024/10/16 17:52:44 kid4| Finished loading MIME types and icons.
> 2024/10/16 17:52:44 kid4| HTCP Disabled.
> 2024/10/16 17:52:44 kid4| Squid plugin modules loaded: 0
> 2024/10/16 17:52:44 kid4| Adaptation support is off.
> 2024/10/16 17:52:44 kid4| assertion failed: Queue.cc:388: "EX"
> 2024/10/16 17:52:44 kid3| Finished loading MIME types and icons.
> 2024/10/16 17:52:44 kid2| Finished loading MIME types and icons.
> 2024/10/16 17:52:44 kid5| Finished loading MIME types and icons.
> 2024/10/16 17:52:45 kid3| HTCP Disabled.
> 2024/10/16 17:52:45 kid2| HTCP Disabled.
> 2024/10/16 17:52:45 kid5| HTCP Disabled.
> 2024/10/16 17:52:45 kid5| Squid plugin modules loaded: 0
> 2024/10/16 17:52:45 kid5| Adaptation support is off.
> 2024/10/16 17:52:45 kid2| Squid plugin modules loaded: 0
> 2024/10/16 17:52:45 kid2| Adaptation support is off.
> 2024/10/16 17:52:45 kid3| Squid plugin modules loaded: 0
> 2024/10/16 17:52:45 kid3| Adaptation support is off.
> 2024/10/16 17:52:45 kid5| assertion failed: Queue.cc:388: "EX"
> 2024/10/16 17:52:45 kid2| assertion failed: Queue.cc:388: "EX"
> 2024/10/16 17:52:45 kid3| assertion failed: Queue.cc:388: "EX"
>
> I observed that each "kid" (worker) crashed once, and then Squid
> successfully stopped and restarted.
>
> Recently, however, the Squid crashes started occurring during the
> start process. The crashes now happen multiple times per worker,
> causing Squid to enter a crash loop. This continues until the hard
> drive fills up with core dumps:
>
> 2025/02/18 10:17:17 kid4| assertion failed: Queue.cc:388: "EX"
> 2025/02/18 10:17:17 kid10| Current Directory is /
> 2025/02/18 10:17:17 kid10| Starting Squid Cache version 5.5 for
> x86_64-redhat-linux-gnu...
> 2025/02/18 10:17:17 kid10| Service Name: squid
> 2025/02/18 10:17:17 kid10| Process ID 18779
> 2025/02/18 10:17:17 kid10| Process Roles: worker
> 2025/02/18 10:17:17 kid10| With 65536 file descriptors available
> 2025/02/18 10:17:17 kid10| Initializing IP Cache...
> 2025/02/18 10:17:17 kid10| DNS Socket created at [::], FD 13
> 2025/02/18 10:17:17 kid10| DNS Socket created at 0.0.0.0, FD 14
> 2025/02/18 10:17:17 kid10| Adding nameserver 127.0.0.1 from squid.conf
> 2025/02/18 10:17:17 kid10| Logfile: opening log /var/log/squid/access.log
> 2025/02/18 10:17:17 kid10| WARNING: log name now starts with a module
> name. Use 'stdio:/var/log/squid/access.log'
> 2025/02/18 10:17:17 kid10| Local cache digest enabled; rebuild/rewrite
> every 3600/3600 sec
> 2025/02/18 10:17:17 kid10| Store logging disabled
> 2025/02/18 10:17:17 kid10| Swap maxSize 0 + 262144 KB, estimated 20164 objects
> 2025/02/18 10:17:17 kid10| Target number of buckets: 1008
> 2025/02/18 10:17:17 kid10| Using 8192 Store buckets
> 2025/02/18 10:17:17 kid10| Max Mem  size: 262144 KB [shared]
> 2025/02/18 10:17:17 kid10| Max Swap size: 0 KB
> 2025/02/18 10:17:17 kid10| Using Least Load store dir selection
> 2025/02/18 10:17:17 kid10| Current Directory is /
> 2025/02/18 10:17:17 kid10| Finished loading MIME types and icons.
> 2025/02/18 10:17:17 kid10| HTCP Disabled.
> 2025/02/18 10:17:17 kid10| Squid plugin modules loaded: 0
> 2025/02/18 10:17:17 kid10| Adaptation support is off.
> 2025/02/18 10:17:17 kid10| assertion failed: Queue.cc:388: "EX"
> 2025/02/18 10:17:17 kid7| Current Directory is /
>
> I investigated the core dump, and the code where Squid is crashing is
> as follows:
>
> #0  0x00007f77b288ba6c in __pthread_kill_implementation () from /lib64/libc.so.6
> Missing separate debuginfos, use: dnf debuginfo-install
> glibc-2.34-125.el9_5.1.x86_64 keyutils-libs-1.6.3-1.el9.x86_64
> krb5-libs-1.21.1-4.el9_5.x86_64 libcap-2.48-9.el9_2.x86_64
> libcom_err-1.46.5-5.el9.x86_64 libecap-1.0.1-10.el9.x86_64
> libgcc-11.5.0-2.el9.x86_64 libgcrypt-1.10.0-11.el9.x86_64
> libgpg-error-1.42-5.el9.x86_64 libselinux-3.6-1.el9.x86_64
> libstdc++-11.5.0-2.el9.x86_64 libtool-ltdl-2.4.6-46.el9.x86_64
> libzstd-1.5.1-2.el9.x86_64 lz4-libs-1.9.3-5.el9.x86_64
> openssl-libs-3.2.2-6.el9_5.x86_64 pcre2-10.40-6.el9.x86_64
> sssd-client-2.9.5-4.el9_5.4.x86_64 systemd-libs-252-46.el9_5.2.x86_64
> xz-libs-5.2.5-8.el9_0.x86_64 zlib-1.2.11-40.el9.x86_64
> (gdb) bt
> #0  0x00007f77b288ba6c in __pthread_kill_implementation () from /lib64/libc.so.6
> #1  0x00007f77b283e686 in raise () from /lib64/libc.so.6
> #2  0x00007f77b2828833 in abort () from /lib64/libc.so.6
> #3  0x0000559bbb412a9d in xassert (msg=<optimized out>,
> file=<optimized out>, line=<optimized out>) at
> /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/debug.cc:624
> #4  0x0000559bbb7549f7 in Ipc::MultiQueue::reader (processId=3,
> this=0x559bbccbf6e0) at ipc/Queue.cc:388
> #5  Ipc::MultiQueue::localReader (this=0x559bbccbf6e0) at ipc/Queue.cc:408
> #6  0x0000559bbb74c192 in Ipc::BaseMultiQueue::localReader
> (this=<optimized out>) at ipc/Queue.cc:195
> #7  Ipc::BaseMultiQueue::clearAllReaderSignals (this=<optimized out>)
> at ipc/Queue.cc:155
> #8  0x0000559bbb486724 in CollapsedForwarding::HandleNewDataAtStart ()
> at /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/CollapsedForwarding.cc:151
> #9  0x0000559bbb6aa50b in AsyncCallQueue::fireNext
> (this=this@entry=0x559bbccd18a0) at base/../../src/base/RefCount.h:73
> #10 0x0000559bbb6aa89a in AsyncCallQueue::fire (this=0x559bbccd18a0)
> at base/AsyncCallQueue.cc:43
> #11 0x0000559bbb490da1 in EventLoop::dispatchCalls
> (this=0x7fff2d084b90) at
> /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/EventLoop.cc:144
> #12 EventLoop::runOnce (this=0x7fff2d084b90) at
> /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/EventLoop.cc:109
> #13 0x0000559bbb5929a8 in EventLoop::run (this=0x7fff2d084b90) at
> /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/EventLoop.cc:83
> #14 SquidMain (argc=argc@entry=5, argv=argv@entry=0x7fff2d084ea8) at
> /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/main.cc:1716
> #15 0x0000559bbb436086 in SquidMainSafe (argv=0x7fff2d084ea8, argc=5)
> at /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/main.cc:1403
> #16 main (argc=5, argv=0x7fff2d084ea8) at
> /usr/src/debug/squid-5.5-14.el9_5.3.x86_64/src/main.cc:1391
> (gdb) f 4
> #4  0x0000559bbb7549f7 in Ipc::MultiQueue::reader (processId=3,
> this=0x559bbccbf6e0) at ipc/Queue.cc:388
> 388        assert(validProcessId(processId));
> (gdb) list
> 383    }
> 384
> 385    const Ipc::QueueReader &
> 386    Ipc::MultiQueue::reader(const int processId) const
> 387    {
> 388        assert(validProcessId(processId));
> 389        const int index = processId - metadata->theProcessIdOffset;
> 390        return readers->theReaders[index];
> 391    }
> 392
> (gdb) p processId
> $1 = 3
> (gdb) p metadata
> $15 = {<RefCount<Ipc::Mem::Object<Ipc::MultiQueue::Metadata> >> = {p_
> = 0x559bbccd9f30}, <No data fields>} (gdb) p *metadata.p_
> $16 = {<Lock> = {_vptr.Lock = 0x559bbb96ea30 <vtable for
> Ipc::Mem::Object<Ipc::MultiQueue::Metadata>+64>, count_ = 1},
>    _vptr.Object = 0x559bbb96ea08 <vtable for
> Ipc::Mem::Object<Ipc::MultiQueue::Metadata>+24>, theSegment = {theFD =
> 9, theName = {static npos = 18446744073709551615, size_ = 40, len_ =
> 23,
>        static SizeMax_ = 196607, buf_ = 0x559bbccbc460
> "/squid-cf__metadata.shm"}, theMem = 0x7f77b3720000, theSize = 8,
> theReserved = 0, doUnlink = false}, theObject = 0x7f77b3720000}
> (gdb) p metadata.p_->theObject
> $17 = (Ipc::MultiQueue::Metadata *) 0x7f77b3720000
> (gdb) p *metadata.p_->theObject
> $18 = {theProcessCount = 0, theProcessIdOffset = 0}
> (gdb) p metadata.p_->theObject->theProcessIdOffset
> $19 = 0
>
>  From this investigation, it appears that the theProcessCount and
> theProcessIdOffset variables are set to 0. However, I don't think this
> is actually the case, as these variables should be shared in SHM, and
> the content of those files was not included in the core dump.
>
> Have any of you experienced similar crashes? I was trying to search
> for a similar issue but without success.
>
> The backtrace above is from Squid version 5.5. While I understand it
> is an older version, I believe the IPC code hasn’t changed much since
> then. Unfortunately, we don't have a reproducer or the possibility of
> upgrading Squid to a newer version on the running system for testing.
>
> It's also worth noting that after restarting Squid, which was stuck in
> the crash loop, Squid started normally. I suspect it could be caused
> by uninitialized SHM or something similar. What do you think?
>
> Would it help if I set some debug options and share the output with
> you? I was considering something like:
>
> debug_options ALL,1 1,9 54,9
>
> Thanks in advance for your response.
>
> Regards,
> Lubos
>
> _______________________________________________
> squid-users mailing list
> squid-users@xxxxxxxxxxxxxxxxxxxxx
> https://lists.squid-cache.org/listinfo/squid-users

_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.squid-cache.org/listinfo/squid-users

Lubos 
_______________________________________________
squid-users mailing list
squid-users@xxxxxxxxxxxxxxxxxxxxx
https://lists.squid-cache.org/listinfo/squid-users

[Index of Archives]     [Linux Audio Users]     [Samba]     [Big List of Linux Books]     [Linux USB]     [Yosemite News]

  Powered by Linux