On 28 March 2023, I wrote,
> Looking at sysdeps/unix/sysv/linux/wait3.c, I guess the only possible
> place for a buffer overrun would be struct __rusage64 usage64.
> https://sources.debian.org/src/glibc/2.36-8/sysdeps/unix/sysv/linux/wait3.c/?hl=41#L41

... but now I see the usage64 variable is not involved at all because __wait3() was passed a NULL pointer:
https://sources.debian.org/src/dash/0.5.12-2/src/jobs.c/?hl=1179#L1179

So NULL (rather than &usage64) was passed to the wait4() syscall, which means the kernel didn't invoke copy_to_user() at all.

AFAICS there's no possible buffer overflow in __wait3(), __wait4_time64() etc. That points to a problem with GCC's SSP detector. Here's a more complete backtrace and some disassembly.

# gdb
GNU gdb (Debian 13.1-2) 13.1
...
(gdb) set osabi GNU/Linux
(gdb) file /bin/dash
Reading symbols from /bin/dash...
Reading symbols from /usr/lib/debug/.build-id/aa/4160f84f3eeee809c554cb9f3e1ef0686b8dcc.debug...
(gdb)
(gdb) core /root/core.0
warning: core file may not match specified executable file.
[New LWP 366]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/m68k-linux-gnu/libthread_db.so.1".
Core was generated by `/bin/sh /etc/init.d/mountkernfs.sh reload'.
Program terminated with signal SIGABRT, Aborted.
#0  __pthread_kill_implementation (threadid=3222954656, signo=6, no_tid=0) at pthread_kill.c:44
44      pthread_kill.c: No such file or directory.
(gdb) bt
#0  __pthread_kill_implementation (threadid=3222954656, signo=6, no_tid=0) at pthread_kill.c:44
#1  0xc00a7080 in __pthread_kill_internal (signo=6, threadid=3222954656) at pthread_kill.c:78
#2  __GI___pthread_kill (threadid=3222954656, signo=6) at pthread_kill.c:89
#3  0xc0064c22 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
#4  0xc0052faa in __GI_abort () at abort.c:79
#5  0xc009b328 in __libc_message (action=<optimized out>, fmt=<optimized out>) at ../sysdeps/posix/libc_fatal.c:155
#6  0xc012a3c2 in __GI___fortify_fail (
    msg=0xc0182c5e "stack smashing detected") at fortify_fail.c:26
#7  0xc012a3a0 in __stack_chk_fail () at stack_chk_fail.c:24
#8  0xc00e0172 in __wait3 (stat_loc=<optimized out>, options=<optimized out>, usage=<optimized out>) at ../sysdeps/unix/sysv/linux/wait3.c:41
#9  0xd000c38e in waitproc (status=0xefee110e, block=1) at jobs.c:1179
#10 waitone (block=1, job=0xd0021930) at jobs.c:1055
#11 0xd000c5b8 in dowait (block=1, jp=0xd0021930) at jobs.c:1137
#12 0xd000ddb0 in waitforjob (jp=0xd0021930) at jobs.c:1014
#13 0xd000aade in expbackq (flag=324, cmd=0xd00222c4) at expand.c:520
#14 argstr (p=<optimized out>, flag=68) at expand.c:335
#15 0xd000b5ce in expandarg (arg=0xd00222ac, arglist=0xefee13bc, flag=4) at expand.c:192
#16 0xd0007e2a in evalcommand (cmd=<optimized out>, flags=<optimized out>) at eval.c:855
#17 0xd0006ffc in evaltree (n=0xd0022294, flags=0) at eval.c:300
#18 0xd0006e96 in evaltree (n=0xd0022294, flags=0) at eval.c:300
#19 0xd0006e6a in evaltree (n=0xd002224c, flags=0) at eval.c:292
#20 0xd0006e6a in evaltree (n=0xd00220d4, flags=0) at eval.c:292
#21 0xd0006e6a in evaltree (n=0xd002208c, flags=0) at eval.c:292
#22 0xd000746a in evalfun (func=0xd0022078, argc=<optimized out>, argv=0xd001e61c <stackbase+376>, flags=<optimized out>) at eval.c:1009
#23 0xd0008176 in evalcommand (cmd=<optimized out>, flags=<optimized out>) at eval.c:921
#24 0xd0006ffc in evaltree (n=0xd001e588 <stackbase+228>, flags=1) at eval.c:300
#25 0xd00084c8 in evaltreenr (flags=1, n=0xd001e588 <stackbase+228>) at eval.c:347
#26 evalbackcmd (n=<optimized out>, result=0xefee17d4) at eval.c:650
#27 0xd000a984 in expbackq (flag=324, cmd=0xd001e588 <stackbase+228>) at expand.c:495
#28 argstr (p=<optimized out>, flag=68) at expand.c:335
#29 0xd000b5ce in expandarg (arg=0xd001e5b0 <stackbase+268>, arglist=0xefee191c, flag=4) at expand.c:192
#30 0xd0007e2a in evalcommand (cmd=<optimized out>, flags=<optimized out>) at eval.c:855
#31 0xd0006ffc in evaltree (n=0xd001e5c0 <stackbase+284>, flags=0) at eval.c:300
#32 0xd000e3c0 in cmdloop (top=0) at main.c:246
#33 0xd000e588 in dotcmd (argc=2, argv=<optimized out>) at main.c:341
#34 0xd0007a12 in evalbltin (cmd=0xd001b598 <builtincmd>, argc=<optimized out>, argv=<optimized out>, flags=<optimized out>) at eval.c:967
#35 0xd00080ca in evalcommand (cmd=<optimized out>, flags=<optimized out>) at eval.c:910
#36 0xd0006ffc in evaltree (n=0xd001e4e8 <stackbase+68>, flags=0) at eval.c:300
#37 0xd000e3c0 in cmdloop (top=1) at main.c:246
#38 0xd0005018 in main (argc=<optimized out>, argv=<optimized out>) at main.c:181
(gdb) frame 8
#8  0xc00e0172 in __wait3 (stat_loc=<optimized out>, options=<optimized out>, usage=<optimized out>) at ../sysdeps/unix/sysv/linux/wait3.c:41
41      ../sysdeps/unix/sysv/linux/wait3.c: No such file or directory.
(gdb) info frame
Stack level 8, frame at 0xefee10e0:
 pc = 0xc00e0172 in __wait3 (../sysdeps/unix/sysv/linux/wait3.c:41); saved pc = 0xd000c38e
 called by frame at 0xefee11dc, caller of frame at 0xefee106c
 source language c.
 Arglist at 0xefee10d8, args: stat_loc=<optimized out>, options=<optimized out>, usage=<optimized out>
 Locals at 0xefee10d8, Previous frame's sp is 0xefee10e0
 Saved registers:
  a2 at 0xefee106c, a3 at 0xefee1070, a5 at 0xefee1074, fp at 0xefee10d8, pc at 0xefee10dc
(gdb) x/32z 0xefee1060
0xefee1060:     0xc0182c5e      0xc0198000      0xc00e0172      0xd001e718
0xefee1070:     0xd001e498      0xd001b874      0x00170700      0x00170700
0xefee1080:     0x00170700      0x00005360      0x0000e920      0x00000006
0xefee1090:     0x00002000      0x00000002      0x00171f20      0x00171f20
0xefee10a0:     0x00171f20      0x000000e0      0x000000e0      0x00000006
0xefee10b0:     0x00000004      0x00000004      0x00000174      0x00000000
0xefee10c0:     0x00000000      0x00000008      0x0000016f      0x0000000a
0xefee10d0:     0x00000000      0x00ac3dbe      0xd001c1ec      0xd000c38e
(gdb)
0xefee10e0:     0xefee111e      0x00000000      0x00000000      0x00000001
0xefee10f0:     0x00000001      0xefee1284      0x00000044      0xd0017714
0xefee1100:     0x00000100      0xd0021930      0xd001c1ec      0xd001e498
0xefee1110:     0xd001b874      0xefee1308      0xc0023e8c      0xefee0000
0xefee1120:     0x00000044      0xd0017714      0x00000100      0xefee1274
0xefee1130:     0xc0023e8c      0xd001c028      0xd001b874      0xefee1208
0xefee1140:     0x00000000      0xc0023e8c      0x00000000      0x00000000
0xefee1150:     0x00000000      0x00000000      0x00000000      0x00000000
(gdb) print &usage64
$1 = (struct __rusage64 *) 0xefee107c
(gdb) disass
Dump of assembler code for function __wait3:
   0xc00e0070 <+0>:     linkw %fp,#-96
   0xc00e0074 <+4>:     moveml %a2-%a3/%a5,%sp@-
   0xc00e0078 <+8>:     lea %pc@(0xc0198000),%a5
   0xc00e0080 <+16>:    movel %fp@(8),%d0
   0xc00e0084 <+20>:    moveal %fp@(16),%a2
   0xc00e0088 <+24>:    moveal %a5@(108),%a3
   0xc00e008c <+28>:    movel %a3@,%fp@(-4)
   0xc00e0090 <+32>:    tstl %a2
   0xc00e0092 <+34>:    beqw 0xc00e0152 <__wait3+226>
   0xc00e0096 <+38>:    pea %fp@(-92)
   0xc00e009a <+42>:    movel %fp@(12),%sp@-
   0xc00e009e <+46>:    movel %d0,%sp@-
   0xc00e00a0 <+48>:    pea 0xffffffff
   0xc00e00a4 <+52>:    bsrl 0xc00e0174 <__GI___wait4_time64>
   0xc00e00aa <+58>:    lea %sp@(16),%sp
   0xc00e00ae <+62>:    tstl %d0
   0xc00e00b0 <+64>:    bgts 0xc00e00c8 <__wait3+88>
   0xc00e00b2 <+66>:    moveal %fp@(-4),%a0
   0xc00e00b6 <+70>:    movel %a3@,%d1
   0xc00e00b8 <+72>:    cmpl %a0,%d1
   0xc00e00ba <+74>:    bnew 0xc00e016c <__wait3+252>
   0xc00e00be <+78>:    moveml %fp@(-108),%a2-%a3/%a5
   0xc00e00c4 <+84>:    unlk %fp
   0xc00e00c6 <+86>:    rts
   0xc00e00c8 <+88>:    pea 0x44
   0xc00e00cc <+92>:    clrl %sp@-
   0xc00e00ce <+94>:    pea %a2@(4)
   0xc00e00d2 <+98>:    movel %d0,%fp@(-96)
   0xc00e00d6 <+102>:   bsrl 0xc00b8850 <__GI_memset>
   0xc00e00dc <+108>:   movel %fp@(-88),%a2@
   0xc00e00e0 <+112>:   movel %fp@(-80),%a2@(4)
   0xc00e00e6 <+118>:   movel %fp@(-72),%a2@(8)
   0xc00e00ec <+124>:   movel %fp@(-64),%a2@(12)
   0xc00e00f2 <+130>:   movel %fp@(-60),%a2@(16)
   0xc00e00f8 <+136>:   movel %fp@(-56),%a2@(20)
   0xc00e00fe <+142>:   movel %fp@(-52),%a2@(24)
   0xc00e0104 <+148>:   movel %fp@(-48),%a2@(28)
   0xc00e010a <+154>:   movel %fp@(-44),%a2@(32)
   0xc00e0110 <+160>:   movel %fp@(-40),%a2@(36)
   0xc00e0116 <+166>:   movel %fp@(-36),%a2@(40)
   0xc00e011c <+172>:   movel %fp@(-32),%a2@(44)
   0xc00e0122 <+178>:   movel %fp@(-28),%a2@(48)
   0xc00e0128 <+184>:   movel %fp@(-24),%a2@(52)
   0xc00e012e <+190>:   movel %fp@(-20),%a2@(56)
   0xc00e0134 <+196>:   movel %fp@(-16),%a2@(60)
   0xc00e013a <+202>:   movel %fp@(-12),%a2@(64)
   0xc00e0140 <+208>:   movel %fp@(-8),%a2@(68)
   0xc00e0146 <+214>:   lea %sp@(12),%sp
   0xc00e014a <+218>:   movel %fp@(-96),%d0
   0xc00e014e <+222>:   braw 0xc00e00b2 <__wait3+66>
   0xc00e0152 <+226>:   clrl %sp@-
   0xc00e0154 <+228>:   movel %fp@(12),%sp@-
   0xc00e0158 <+232>:   movel %d0,%sp@-
   0xc00e015a <+234>:   pea 0xffffffff
   0xc00e015e <+238>:   bsrl 0xc00e0174 <__GI___wait4_time64>
   0xc00e0164 <+244>:   lea %sp@(16),%sp
   0xc00e0168 <+248>:   braw 0xc00e00b2 <__wait3+66>
   0xc00e016c <+252>:   bsrl 0xc012a38c <__stack_chk_fail>
End of assembler dump.
(gdb)

Note that __wait3(stat_loc, options, NULL) reduces to,

    return __wait4_time64(-1, stat_loc, options, NULL);

So I think the branch at __wait3+34 was taken, and after bsr __GI___wait4_time64, the branch at __wait3+248 would have been taken. Then the canary located at %fp@(-4) was compared with %a3@. From the hex dump above, %fp@(-4) is 0xd000c38e.

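The check that starts at __wait3+66 can be modelled in C roughly as follows. This is only a toy sketch: struct frame_model, model_prologue() and model_epilogue_ok() are names invented for illustration, the guard values are made up, and none of it is glibc or GCC code. The second case shows that the comparison also fails when the frame is intact but the pointer register ends up pointing somewhere else.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a protected frame: only the canary slot matters. */
struct frame_model {
    uintptr_t canary_slot;   /* plays the role of %fp@(-4) */
};

/* Prologue: copy the guard into the frame through the guard pointer
 * (compare "movel %a3@,%fp@(-4)"). */
static void model_prologue(struct frame_model *frame, const uintptr_t *guard_ptr)
{
    assert(guard_ptr != NULL);
    frame->canary_slot = *guard_ptr;
}

/* Epilogue: reload the guard through the pointer and compare it with the
 * slot (compare "moveal %fp@(-4),%a0; movel %a3@,%d1; cmpl %a0,%d1"). */
static bool model_epilogue_ok(const struct frame_model *frame, const uintptr_t *guard_ptr)
{
    assert(guard_ptr != NULL);
    return frame->canary_slot == *guard_ptr;
}

/* Case 1: same pointer in prologue and epilogue, frame untouched. */
bool check_with_intact_pointer(void)
{
    const uintptr_t guard = 0x1234567u;      /* made-up guard value */
    struct frame_model frame;

    model_prologue(&frame, &guard);
    return model_epilogue_ok(&frame, &guard);
}

/* Case 2: frame untouched, but the pointer now refers to unrelated memory. */
bool check_with_clobbered_pointer(void)
{
    const uintptr_t guard = 0x1234567u;      /* made-up guard value */
    const uintptr_t unrelated = 0x89abcdeu;  /* stands in for whatever the pointer hits */
    struct frame_model frame;

    model_prologue(&frame, &guard);
    return model_epilogue_ok(&frame, &unrelated);
}
```

In this model check_with_intact_pointer() passes and check_with_clobbered_pointer() fails, even though nothing ever wrote to the frame in either case.
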
As for %a3, we know its value when SIGABRT was caught, and if I'm not mistaken, %a3 was not altered by __stack_chk_fail or __GI___fortify_fail...

(gdb) frame 7
#7  0xc012a3a0 in __stack_chk_fail () at stack_chk_fail.c:24
24      stack_chk_fail.c: No such file or directory.
(gdb) info frame
Stack level 7, frame at 0xefee106c:
 pc = 0xc012a3a0 in __stack_chk_fail (stack_chk_fail.c:24); saved pc = 0xc00e0172
 called by frame at 0xefee10e0, caller of frame at 0xefee1060
 source language c.
 Arglist at 0xefee105c, args:
 Locals at 0xefee105c, Previous frame's sp is 0xefee106c
 Saved registers:
  a5 at 0xefee1064, pc at 0xefee1068
(gdb) disass
Dump of assembler code for function __stack_chk_fail:
   0xc012a38c <+0>:     movel %a5,%sp@-
   0xc012a38e <+2>:     lea %pc@(0xc0198000),%a5
   0xc012a396 <+10>:    movel %a5@(10696),%sp@-
   0xc012a39a <+14>:    bsrl 0xc012a3a0 <__GI___fortify_fail>
End of assembler dump.
(gdb) frame 6
#6  0xc012a3c2 in __GI___fortify_fail (
    msg=0xc0182c5e "stack smashing detected") at fortify_fail.c:26
26      fortify_fail.c: No such file or directory.
(gdb) info frame
Stack level 6, frame at 0xefee1060:
 pc = 0xc012a3c2 in __GI___fortify_fail (fortify_fail.c:26); saved pc = 0xc012a3a0
 called by frame at 0xefee106c, caller of frame at 0xefee1044
 source language c.
 Arglist at 0xefee1040, args: msg=0xc0182c5e "stack smashing detected"
 Locals at 0xefee1040, Previous frame's sp is 0xefee1060
 Saved registers:
  d2 at 0xefee1050, d3 at 0xefee1054, a5 at 0xefee1058, pc at 0xefee105c
(gdb) disass
Dump of assembler code for function __GI___fortify_fail:
   0xc012a3a0 <+0>:     moveml %d2-%d3/%a5,%sp@-
   0xc012a3a4 <+4>:     lea %pc@(0xc0198000),%a5
   0xc012a3ac <+12>:    movel %sp@(16),%d3
   0xc012a3b0 <+16>:    movel %a5@(10700),%d2
   0xc012a3b4 <+20>:    movel %d3,%sp@-
   0xc012a3b6 <+22>:    movel %d2,%sp@-
   0xc012a3b8 <+24>:    pea 0x1
   0xc012a3bc <+28>:    bsrl 0xc009b1d8 <__libc_message>
=> 0xc012a3c2 <+34>:    lea %sp@(12),%sp
   0xc012a3c6 <+38>:    movel %d3,%sp@-
   0xc012a3c8 <+40>:    movel %d2,%sp@-
   0xc012a3ca <+42>:    pea 0x1
   0xc012a3ce <+46>:    bsrl 0xc009b1d8 <__libc_message>
   0xc012a3d4 <+52>:    lea %sp@(12),%sp
   0xc012a3d8 <+56>:    bras 0xc012a3b4 <__GI___fortify_fail+20>
End of assembler dump.
(gdb) info reg
d0             0x0                 0
d1             0x16e               366
d2             0xc0182c76          -1072157578
d3             0xc0182c5e          -1072157602
d4             0xefee1122          -269610718
d5             0x1                 1
d6             0xd0021930          -805168848
d7             0x100               256
a0             0xc01a62a0          0xc01a62a0
a1             0xffffffe6          0xffffffe6
a2             0x0                 0x0
a3             0xefee1068          0xefee1068
a4             0xd001e71c          0xd001e71c <pending_sig>
a5             0xc0198000          0xc0198000
fp             0xefee10d8          0xefee10d8
sp             0xefee1044          0xefee1044
ps             0x10                [ X ]
pc             0xc012a3c2          0xc012a3c2 <__GI___fortify_fail+34>
fpcontrol      0x0                 0
fpstatus       0x0                 0
fpiaddr        0x0                 0x0

So %a3 was a pointer into stack frame 6??

(gdb) x/z $a3
0xefee1068:     0xc00e0172

Clearly 0xd000c38e != 0xc00e0172 (that is, %fp@(-4) != %a3@) but did the canary value change? It rather looks like the canary pointer is wrong...

Another way to find the value of %a3 during __wait3() execution is to look at its initialization: moveal %a5@(108),%a3. And we can see from 'info frame' above that __stack_chk_fail() saved %a5 at 0xefee1064.

(gdb) x/4z 0xefee1060
0xefee1060:     0xc0182c5e      0xc0198000      0xc00e0172      0xd001e718
(gdb) x/z *0xefee1064+108
0xc019806c:     Cannot access memory at address 0xc019806c

Anyway, if the analysis is right (hopefully someone can confirm that), this looks like a GCC bug. I'm not sure why it only shows up during (sysvinit) init script execution. The canary value is derived from /dev/urandom, so I guess the failure is intermittent because it is connected to kernel PRNG state during early startup.
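
In case it helps anyone trying to reproduce this outside of the init scripts, here is a minimal sketch of the same call pattern, that is, wait3() with a NULL rusage pointer. It is not dash's code: reap_child_without_rusage() is a hypothetical helper written for this message, and I have no evidence that it triggers the failure.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that exits immediately with exit_code,
 * then reap it with wait3(&status, 0, NULL), i.e. without asking for
 * resource usage. Returns the child's exit status, or -1 on any error.
 * Assumes the caller has no other unreaped children.
 */
int reap_child_without_rusage(int exit_code)
{
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0)
        _exit(exit_code);           /* child: exit without running atexit handlers */

    int status = 0;
    pid_t reaped;

    do {
        /* NULL rusage: the library has nothing to copy back to the caller. */
        reaped = wait3(&status, 0, NULL);
    } while (reaped < 0 && errno == EINTR);

    if (reaped != child || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling this in a loop from a small test program built with -fstack-protector-strong might show whether the abort can be provoked without dash being involved at all.
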