Here is the interesting part of /var/log/kern.log

>> Last segfault, on same program source, but compiled as a 32-bit application
Jul 18 17:58:07 stadler kernel: [198552.353997] tweaklib2[2821]: segfault at 29a73c ip ffff80010017ce5c (rpc ffff80010014b218) sp 000007feffaf88a1 error 1 in libtweaklib1.so[ffff800100128000+72000]
>> This did not generate any bug (there are many more segfaults before, all generated by faulty Free Pascal compiled executables...)

>> Here is the start of the kernel bug itself
>> It probably only happened when running tweaklib2 from strace
Jul 18 20:04:02 stadler kernel: [206085.986213] usercopy: Kernel memory exposure attempt detected from process stack (offset 0, size 128)!
Jul 18 20:04:02 stadler kernel: [206085.986383] kernel BUG at /build/linux-aIPx1G/linux-4.17.6/mm/usercopy.c:100!
Jul 18 20:04:02 stadler kernel: [206085.986394] \|/ ____ \|/
Jul 18 20:04:02 stadler kernel: [206085.986394] "@'/ .. \`@"
Jul 18 20:04:02 stadler kernel: [206085.986394] /_| \__/ |_\
Jul 18 20:04:02 stadler kernel: [206085.986394] \__U_/
Jul 18 20:04:02 stadler kernel: [206085.986402] tweaklib2(4354): Kernel bad sw trap 5 [#1]
Jul 18 20:04:02 stadler kernel: [206085.986416] CPU: 25 PID: 4354 Comm: tweaklib2 Not tainted 4.17.0-1-sparc64-smp #1 Debian 4.17.6-1
Jul 18 20:04:02 stadler kernel: [206085.986427] TSTATE: 0000004411001605 TPC: 0000000000634620 TNPC: 0000000000634624 Y: 00000001 Not tainted
Jul 18 20:04:02 stadler kernel: [206085.986448] TPC: <usercopy_abort+0x80/0xa0>
Jul 18 20:04:02 stadler kernel: [206085.986457] g0: ffff8003df8e0000 g1: 0000000000000000 g2: 0000000000000007 g3: 0000000000000000
Jul 18 20:04:02 stadler kernel: [206085.986466] g4: ffff80069f66de00 g5: ffff8007fe99e000 g6: ffff8003df8e0000 g7: 000000000000000e
Jul 18 20:04:02 stadler kernel: [206085.986473] o0: 0000000000b947f0 o1: 0000000000000064 o2: 0000000000bd5d90 o3: 0000000000b94898
Jul 18 20:04:02 stadler kernel: [206085.986482] o4: 0000000000baacf0 o5: 0000000000baacf0 sp: ffff8003df8e1871 ret_pc: 0000000000634618
Jul 18 20:04:02 stadler kernel: [206085.986493] RPC: <usercopy_abort+0x78/0xa0>
Jul 18 20:04:02 stadler kernel: [206085.986503] l0: ffff8003df8e0110 l1: 0000000000000022 l2: ffff8003df8e0000 l3: ffff8003df8e03c8
Jul 18 20:04:02 stadler kernel: [206085.986511] l4: 0000000000000000 l5: 0000000000000400 l6: ffff8003df8e0000 l7: 0000000011001005
Jul 18 20:04:02 stadler kernel: [206085.986519] i0: 0000000000b94898 i1: 0000000000baacf0 i2: 0000000000000001 i3: 0000000000000000
Jul 18 20:04:02 stadler kernel: [206085.986527] i4: 0000000000000080 i5: ffffffffffffffff i6: ffff8003df8e1941 i7: 00000000006347e4
Jul 18 20:04:02 stadler kernel: [206085.986539] I7: <__check_object_size+0x1a4/0x220>
Jul 18 20:04:02 stadler kernel: [206085.986544] Call Trace:
Jul 18 20:04:02 stadler kernel: [206085.986557] [00000000006347e4] __check_object_size+0x1a4/0x220
Jul 18 20:04:02 stadler kernel: [206085.986578] [000000000042dfb0] synchronize_user_stack+0xb0/0x180
Jul 18 20:04:02 stadler kernel: [206085.986592] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986605] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986617] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986630] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986643] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986655] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986668] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986680] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986693] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986705] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986718] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986730] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986743] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986756] [000000000042dfc4] synchronize_user_stack+0xc4/0x180
>> Looks like an infinite stack recursion in the kernel, no?
Jul 18 20:04:02 stadler kernel: [206085.986761] Disabling lock debugging due to kernel taint
Jul 18 20:04:02 stadler kernel: [206085.986776] Caller[00000000006347e4]: __check_object_size+0x1a4/0x220
Jul 18 20:04:02 stadler kernel: [206085.986790] Caller[000000000042dfb0]: synchronize_user_stack+0xb0/0x180
Jul 18 20:04:02 stadler kernel: [206085.986803] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986817] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986830] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986843] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986856] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986868] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986881] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986894] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986907] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986920] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986932] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986945] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986958] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986971] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986984] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.986996] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.987009] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.987022] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.987035] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.987047] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.987060] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.987073] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.987086] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.987099] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.987112] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.987125] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.987137] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.987150] Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180
Jul 18 20:04:02 stadler kernel: [206085.987154] Instruction DUMP:
Jul 18 20:04:02 stadler kernel: [206085.987160] 92102064
Jul 18 20:04:02 stadler kernel: [206085.987165] 7ff7d0fa
Jul 18 20:04:02 stadler kernel: [206085.987170] 901223f0
Jul 18 20:04:02 stadler kernel: [206085.987175] <91d02005>
Jul 18 20:04:02 stadler kernel: [206085.987180] 33002eab
Jul 18 20:04:02 stadler kernel: [206085.987185] b21660f0
Jul 18 20:04:02 stadler kernel: [206085.987190] 98100019
Jul 18 20:04:02 stadler kernel: [206085.987195] 106ffff0
Jul 18 20:04:02 stadler kernel: [206085.987200] 82100019
Jul 18 20:04:02 stadler kernel: [206085.987204]
Jul 18 20:04:02 stadler kernel: [206106.996141] INFO: rcu_sched detected stalls on CPUs/tasks:
Jul 18 20:04:02 stadler kernel: [206106.996331] 25-...0: (1 GPs behind) idle=276/1/4611686018427387904 softirq=1454807/1454808 fqs=2625
Jul 18 20:04:02 stadler kernel: [206106.996432] (detected by 12, t=5252 jiffies, g=1579173, c=1579172, q=520)
Jul 18 20:04:02 stadler kernel: [206106.996702] CPU[ 25]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1]
Jul 18 20:04:02 stadler kernel: [206106.996709] TPC[0] O7[0] I7[0] RPC[0]
Jul 18 20:05:05 stadler kernel: [206170.010519] INFO: rcu_sched detected stalls on CPUs/tasks:
Jul 18 20:05:05 stadler kernel: [206170.010711] 25-...0: (1 GPs behind) idle=276/1/4611686018427387904 softirq=1454807/1454808 fqs=10495
Jul 18 20:05:05 stadler kernel: [206170.010812] (detected by 8, t=21007 jiffies, g=1579173, c=1579172, q=2477)
Jul 18 20:05:05 stadler kernel: [206170.011032] CPU[ 25]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1]
Jul 18 20:05:05 stadler kernel: [206170.011040] TPC[0] O7[0] I7[0] RPC[0]
Jul 18 20:06:08 stadler kernel: [206233.024894] INFO: rcu_sched detected stalls on CPUs/tasks:
Jul 18 20:06:08 stadler kernel: [206233.025085] 25-...0: (1 GPs behind) idle=276/1/4611686018427387904 softirq=1454807/1454808 fqs=18067
Jul 18 20:06:08 stadler kernel: [206233.025187] (detected by 13, t=36762 jiffies, g=1579173, c=1579172, q=2779)
Jul 18 20:06:08 stadler kernel: [206233.025455] CPU[ 25]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1]
Jul 18 20:06:08 stadler kernel: [206233.025463] TPC[0] O7[0] I7[0] RPC[0]
Jul 18 20:07:11 stadler kernel: [206296.039270] INFO: rcu_sched detected stalls on CPUs/tasks:
Jul 18 20:07:11 stadler kernel: [206296.039462] 25-...0: (1 GPs behind) idle=276/1/4611686018427387904 softirq=1454807/1454808 fqs=25566
Jul 18 20:07:11 stadler kernel: [206296.039564] (detected by 5, t=52517 jiffies, g=1579173, c=1579172, q=3064)
Jul 18 20:07:11 stadler kernel: [206296.039832] CPU[ 25]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1]
Jul 18 20:07:11 stadler kernel: [206296.039840] TPC[0] O7[0] I7[0] RPC[0]
Jul 18 20:08:14 stadler kernel: [206359.053646] INFO: rcu_sched detected stalls on CPUs/tasks:
Jul 18 20:08:14 stadler kernel: [206359.053839] 25-...0: (1 GPs behind) idle=276/1/4611686018427387904 softirq=1454807/1454808 fqs=33228
Jul 18 20:08:14 stadler kernel: [206359.053941] (detected by 29, t=68272 jiffies, g=1579173, c=1579172, q=3758)
Jul 18 20:08:14 stadler kernel: [206359.054276] CPU[ 25]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1]
Jul 18 20:08:14 stadler kernel: [206359.054283] TPC[0] O7[0] I7[0] RPC[0]
Jul 18 20:09:17 stadler kernel: [206422.068021] INFO: rcu_sched detected stalls on CPUs/tasks:
Jul 18 20:09:17 stadler kernel: [206422.068231] 25-...0: (1 GPs behind) idle=276/1/4611686018427387904 softirq=1454807/1454808 fqs=40717
Jul 18 20:09:17 stadler kernel: [206422.068354] (detected by 0, t=84027 jiffies, g=1579173, c=1579172, q=4405)
Jul 18 20:09:17 stadler kernel: [206422.068682] CPU[ 25]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1]
Jul 18 20:09:17 stadler kernel: [206422.068689] TPC[0] O7[0] I7[0] RPC[0]
Jul 18 20:10:20 stadler kernel: [206485.082397] INFO: rcu_sched detected stalls on CPUs/tasks:
Jul 18 20:10:20 stadler kernel: [206485.082604] 25-...0: (1 GPs behind) idle=276/1/4611686018427387904 softirq=1454807/1454808 fqs=47985
Jul 18 20:10:20 stadler kernel: [206485.082726] (detected by 13, t=99782 jiffies, g=1579173, c=1579172, q=6290)
Jul 18 20:10:20 stadler kernel: [206485.083053] CPU[ 25]: TSTATE[0000000000000000] TPC[0000000000000000] TNPC[0000000000000000] TASK[NULL:-1]
Jul 18 20:10:20 stadler kernel: [206485.083060] TPC[0] O7[0] I7[0] RPC[0]
>> Nevertheless, it seems that the kernel stayed up for 6 minutes past the kernel bug,
>> so maybe it was generated already on my first run of tweaklib2?
Jul 18 21:11:25 stadler kernel: [ 0.000029] PROMLIB: Sun IEEE Boot Prom 'OBP 4.30.4.e 2013/09/23 16:08'
Jul 18 21:11:25 stadler kernel: [ 0.000037] PROMLIB: Root node compatible: sun4v
Jul 18 21:11:25 stadler kernel: [ 0.000102] Linux version 4.16.0-1-sparc64-smp (debian-kernel@xxxxxxxxxxxxxxxx) (gcc version 7.3.0 (Debian 7.3.0-17)) #1 SMP Debian 4.16.5-1 (2018-04-29)

Adrian,
maybe it is best if you try to reproduce the crash as root;
I copied the two required files into the /root/kernel-crash-test directory.

Pierre

On 18/07/2018 at 23:06, John Paul Adrian Glaubitz wrote:
> Hi Pierre!
>
> On 07/18/2018 10:25 PM, Pierre Muller wrote:
>> Maybe I was finally successful:
>>
>> stadler seems to have run into trouble after running a single (faulty) executable
>>
>> the executable is in
>> /home/pierre/pas/test directory
>>
>> its name is
>> ./tweaklib2
>>
>> it uses a (probably faulty) library called
>> ./libtweaklib1.so
>>
>> both are 64-bit executables generated with the current Free Pascal compiler
>> on my account, located in directory
>> /home/pierre/pas/fpc-3.1.1/bin/ppcsparc64 (cross compiler from 32 to 64 bit)
>>
>> The last thing I did was
>> recompiling those two sources, using
>>
>> pierre@stadler:~/pas/test$ ppcsparc64 -gl tweaklib1.pp -k-rpath -k. -Cg -n -Fu../trunk/fpcsrc/rtl/units/sparc64-linux
>> pierre@stadler:~/pas/test$ ppcsparc64 -gl tweaklib2.pp -k-rpath -k. -Cg -n -Fu../trunk/fpcsrc/rtl/units/sparc64-linux
>>
>> Running simply ./tweaklib2
>> generated a SIGILL which stopped the program without any core generation...
>>
>> Running it inside gdb did not help,
>> SIGILL was not caught by GDB :-(
>>
>> Running it with strace generated this message:
>> rt_sigaction(SIGBUS, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, NULL, 0xffff800100134ca8, 8) = 0
>> rt_sigaction(SIGILL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, NULL, 0xffff800100134ca8, 8) = 0
>>
>> Message from syslogd@stadler at Jul 18 20:04:02 ...
>> kernel:[206085.986213] usercopy: Kernel memory exposure attempt detected from process stack (offset 0, size 128)!
>
> Great. Looks like we have a reproducer for the kernel developers now :).
>
>> I tried to access the machine via a second ssh session,
>> but the session froze also, and I finally got completely kicked out,
>> probably due to a system shutdown...
>>
>> This kernel crash might be easier to reproduce than any of what I got before ...
>>
>> I am still unable to reconnect to the machine,
>> so maybe someone can restart the machine,
>> and test if running the tweaklib2 in /home/pierre/pas/test directory
>> several times or with strace is sufficient to get a 'reproducible' crash.
>>
>> In the hope that this will help,
> Just did a hardware reset, should be back shortly.
>
> Adrian
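
For anyone who wants to check the repeated synchronize_user_stack frames in the trace above without counting by hand, here is a minimal Python sketch. It is not part of the original report: the function name collapse_frames and the regular expression are my own assumptions, written only against the two frame formats visible in this kern.log excerpt ("[addr] symbol+off/size" and "Caller[addr]: symbol+off/size"). It collapses consecutive identical frames into one entry with a repeat count.

```python
import re
from itertools import groupby

# Assumed frame formats, taken from the excerpt above:
#   "[000000000042dfc4] synchronize_user_stack+0xc4/0x180"
#   "Caller[000000000042dfc4]: synchronize_user_stack+0xc4/0x180"
# The 16-hex-digit requirement keeps the "[206085.986557]" timestamps out.
FRAME_RE = re.compile(
    r"(?:Caller)?\[([0-9a-f]{16})\]:?\s+(\S+\+0x[0-9a-f]+/0x[0-9a-f]+)"
)


def collapse_frames(lines):
    """Return (address, symbol, repeat_count) for each run of identical frames."""
    frames = []
    for line in lines:
        match = FRAME_RE.search(line)
        if match:
            frames.append((match.group(1), match.group(2)))
    # groupby only merges *consecutive* equal frames, which is what we want
    # when looking for a frame that repeats back to back.
    return [(addr, sym, sum(1 for _ in run)) for (addr, sym), run in groupby(frames)]
```

Feeding it the lines of the log excerpt (for example via open("/var/log/kern.log")) should give one entry per distinct run of frames, so a long run of the same address shows up as a single tuple with a large count. Note that a long run may only mean the unwinder is stuck on one frame, not that the kernel really recursed.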