---------- Original e-mail ----------
From: John David Anglin
To: linux-parisc@xxxxxxxxxxxxxxx
CC: Helge Deller
Date: 17. 5. 2024 21:05:19
Subject: [PATCH v2] parisc: Try to fix random segmentation faults in package builds

> The majority of random segmentation faults that I have looked at
> appear to be memory corruption in memory allocated using mmap and
> malloc.
>
> [...]
>
> This made it clear that we needed to implement all the required
> flush operations using tmpalias routines. This includes flushes
> for user and kernel pages.
>
> This change is the result of that conversion. As far as I can
> tell, it fixes the random segmentation faults on c8000.
>
> Base for patch is 6.8.9.

Hello,

I applied the patch to a 6.8.9 kernel with Gentoo patches on my C8000 and ran it under heavy load over the weekend. The system has been much more stable than in the past (yay!), but I've still experienced one userspace program crash and a kernel panic.

The crash was a memory corruption while compiling Boost (version boost-1.84.0-r3 from Gentoo). It might be caused by a kernel memory handling bug, but it's hard to say. Upon recompiling, the problem didn't manifest again. There was nothing in the syslog and the output is not very informative:

```
gcc.compile.c++ bin.v2/libs/wave/build/gcc-13.2/gentoorelease/pch-off/threading-multi/visibility-hidden/instantiate_re2c_lexer.o
"hppa2.0-unknown-linux-gnu-g++" -fvisibility-inlines-hidden -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -O2 -pipe -march=2.0 -mschedule=8000 -ggdb -std=c++17 -fPIC -pthread -finline-functions -Wno-inline -Wall -fvisibility=hidden -DBOOST_ALL_DYN_LINK=1 -DBOOST_ALL_NO_LIB=1 -DBOOST_COBALT_USE_STD_PMR=1 -DNDEBUG -I"." -c -o "bin.v2/libs/wave/build/gcc-13.2/gentoorelease/pch-off/threading-multi/visibility-hidden/instantiate_re2c_lexer.o" "libs/wave/src/instantiate_re2c_lexer.cpp"
{standard input}: Assembler messages:
{standard input}:401704: Error: unknown pseudo-op: `.ule'
{standard input}:401704: Error: junk at end of line, first unrecognized character valued 0x2
{standard input}:401704: Error: junk at end of line, first unrecognized character valued 0x1
{standard input}:401704: Error: junk at end of line, first unrecognized character valued 0x1
{standard input}:401704: Error: Unknown opcode: `ag�'
{standard input}:401704: Error: junk at end of line, first unrecognized character valued 0x3
{standard input}:401704: Error: Unknown opcode: `uƀ'
{standard input}:401704: Error: junk at end of line, first unrecognized character valued 0x1
{standard input}:401704: Error: junk at end of line, first unrecognized character valued 0x10
{standard input}:401704: Error: junk at end of line, first unrecognized character valued 0x10
{standard input}:401704: Error: Unknown opcode: `a ��'
{standard input}:401704: Error: Unknown opcode: `x15'
{standard input}:796319: Warning: end of file in string; '"' inserted
{standard input}:797107: Warning: missing closing '"'
{standard input}:797107: Error: Unknown opcode: ` '
gcc.compile.c++ bin.v2/libs/wave/build/gcc-13.2/gentoorelease/pch-off/threading-multi/visibility-hidden/instantiate_re2c_lexer_str.o
"hppa2.0-unknown-linux-gnu-g++" -fvisibility-inlines-hidden -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -O2 -pipe -march=2.0 -mschedule=8000 -ggdb -std=c++17 -fPIC -pthread -finline-functions -Wno-inline -Wall -fvisibility=hidden -DBOOST_ALL_DYN_LINK=1 -DBOOST_ALL_NO_LIB=1 -DBOOST_COBALT_USE_STD_PMR=1 -DNDEBUG -I"." -c -o "bin.v2/libs/wave/build/gcc-13.2/gentoorelease/pch-off/threading-multi/visibility-hidden/instantiate_re2c_lexer_str.o" "libs/wave/src/instantiate_re2c_lexer_str.cpp"
...skipped <pbin.v2/libs/wave/build/gcc-13.2/gentoorelease/pch-off/threading-multi/visibility-hidden>libboost_wave.so.1.84.0 for lack of <pbin.v2/libs/wave/build/gcc-13.2/gentoorelease/pch-off/threading-multi/visibility-hidden>instantiate_re2c_lexer.o...
...skipped <p/var/tmp/portage/dev-libs/boost-1.84.0-r3/work/boost_1_84_0-.hppa/stage/lib>libboost_wave.so.1.84.0 for lack of <pbin.v2/libs/wave/build/gcc-13.2/gentoorelease/pch-off/threading-multi/visibility-hidden>libboost_wave.so.1.84.0...
...failed updating 1 target...
```

The kernel panic happened after a few days of uptime. I got the following output on the serial console, after which the machine rebooted immediately and the usual boot output followed.

```
[163003.648077] Backtrace:
[163003.648077] [<00000000408d7740>] btrfs_add_delayed_data_ref+0x1d4/0x598
[163003.648077] [<00000000407e61cc>] btrfs_alloc_reserved_file_extent+0x158/0x210
[163003.648077] [<0000000040819638>] insert_reserved_file_extent+0x77c/0x840
[163003.648077] [<0000000040825220>] btrfs_finish_one_ordered+0xa54/0x1310
[163003.648077] [<0000000040825b0c>] btrfs_finish_ordered_io+0x30/0x70
[163003.648077] [<000000004085cadc>] finish_ordered_fn+0x38/0x78
[163003.648077] [<0000000040891c9c>] btrfs_work_helper+0x208/0x790
[163003.648077] [<000000004025eb74>] process_one_work+0x228/0x540
[163003.648077] [<000000004025f1ac>] worker_thread+0x320/0x760
[163003.648077] [<0000000040273020>] kthread+0x254/0x280
[163003.648077] [<00000000401df020>] ret_from_kernel_thread+0x20/0x28
[163003.648077]
[163003.648077]
[163003.648077] Page fault: no context: Code=15 (Data TLB miss fault) at addr 0000003b1047cdee
[163003.648077] CPU: 3 PID: 9804 Comm: kworker/u8:4 Not tainted 6.8.9-gentoo-64bit-debug #2
[163003.648077] Hardware name: 9000/785/C8000
[163003.648077] Workqueue: btrfs-endio-write btrfs_work_helper
[163003.648077]
[163003.648077]      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[163003.648077] PSW: 00001000000001000000000000001111 Not tainted
[163003.648077] r00-03 000000000804000f 0000000041235180 0000000040520374 00000003202e8d50
[163003.648077] r04-07 0000000041165180 0000000055076f20 0000000000000001 00000000408d7740
[163003.648077] r08-11 0000000000000c40 00000000000000b0 0000000000000000 00000000417b72a8
[163003.648077] r12-15 0000000000455a10 00000000556f2000 0000000000000001 00000002a326d620
[163003.648077] r16-19 00000003202e8b68 0000000000000000 0000000000000000 82140f831247cc96
[163003.648077] r20-23 00000000000d486a 0000000057ec22e0 00000000000d486b 000000004147c0d0
[163003.648077] r24-27 eecd47103bccf4da daf4cc3b1047cdee daf4cc3b1047cd96 0000000041165180
[163003.648077] r28-31 0000000000000058 00cd001000cc00da 00000003202e8e30 250b33c4efb8326a
[163003.648077] sr00-03 0000000007067400 0000000000000000 0000000000000000 0000000007067400
[163003.648077] sr04-07 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[163003.648077]
[163003.648077] IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000405202d8 00000000405202dc
[163003.648077] IIR: 0f9a00dc ISR: 000000001af4cc00 IOR: 0000003b1047cdee
[163003.648077] CPU: 3 CR30: 0000000057ec22e0 CR31: fffffffffffeffff
[163003.648077] ORIG_R28: 00000000000009de
[163003.648077] IAOQ[0]: kmem_cache_alloc+0x10c/0x520
[163003.648077] IAOQ[1]: kmem_cache_alloc+0x110/0x520
[163003.648077] RP(r2): kmem_cache_alloc+0x1a8/0x520
[163003.648077] Backtrace:
[163003.648077] [<00000000408d7740>] btrfs_add_delayed_data_ref+0x1d4/0x598
[163003.648077] [<00000000407e61cc>] btrfs_alloc_reserved_file_extent+0x158/0x210
[163003.648077] [<0000000040819638>] insert_reserved_file_extent+0x77c/0x840
[163003.648077] [<0000000040825220>] btrfs_finish_one_ordered+0xa54/0x1310
[163003.648077] [<0000000040825b0c>] btrfs_finish_ordered_io+0x30/0x70
[163003.648077] [<000000004085cadc>] finish_ordered_fn+0x38/0x78
[163003.648077] [<0000000040891c9c>] btrfs_work_helper+0x208/0x790
[163003.648077] [<000000004025eb74>] process_one_work+0x228/0x540
[163003.648077] [<000000004025f1ac>] worker_thread+0x320/0x760
[163003.648077] [<0000000040273020>] kthread+0x254/0x280
[163003.648077] [<00000000401df020>] ret_from_kernel_thread+0x20/0x28
[163003.648077]
<Cpu3> 0300109103e00000 0000000000000000 CC_PROCS_ENTRY_OUT
[163003.648077] Kernel panic - not syncing: Page fault: no context
<Cpu3> 160012c803e00000 0300000000000000 CC_MPS_SLAVE_DISPATCHER_ENT
<Cpu3> 0300096303e00000 0000000008000008 CC_BOOT_MEM_CPU_RENDEZVOUS
<Cpu3> 160012d303e00000 0300000000000000 CC_MPS_SLAVE_SLEEPING
<Cpu3> 080012d403e00000 000000f0f0d07dc0 CC_MPS_SLAVE_SLEEP_ADDR
<Cpu2> 3400006302e00000 0000000000000000 CC_BOOT_START
```