Re: rcutorture’s init segfaults in ppc64le VM

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Dear Zhouyi,


Thank you for still looking into this.


Am 10.03.22 um 03:37 schrieb Zhouyi Zhou:

I try to reproduce the bug in ppc64 VM in Oregon State University
using the vmlinux extracted from
https://owww.molgen.mpg.de/~pmenzel/rcutorture-2022.02.01-21.52.37-torture-locktorture-kasan-lock01.tar.xz

the ppc64 VM in which I run the qemu without hardware acceleration is:
Linux version 5.4.0-100-generic (buildd@bos02-ppc64el-021) (gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)) #113-Ubuntu SMP Thu Feb 3 18:43:11 UTC 2022 (Ubuntu 5.4.0-100.113-generic 5.4.166)


The qemu command I use to test:
cd /tmp/dev/shm/linux/tools/testing/selftests/rcutorture/res/2022.02.01-21.52.37-torture/results-locktorture-kasan/LOCK01$
$qemu-system-ppc64   -nographic -smp cores=2,threads=1 -net none -M
pseries -nodefaults -device spapr-vscsi -serial file:/tmp/console.log
-m 512 -kernel ./vmlinux -append "debug_boot_weak_hash panic=-1
console=ttyS0 rcutorture.onoff_interval=200
rcutorture.onoff_holdoff=30 rcutree.gp_preinit_delay=12
rcutree.gp_init_delay=3 rcutree.gp_cleanup_delay=3
rcutree.kthread_prio=2 threadirqs tree.use_softirq=0
rcutorture.n_barrier_cbs=4 rcutorture.stat_interval=15
rcutorture.shutdown_secs=1800 rcutorture.test_no_idle_hz=1
rcutorture.verbose=1"

The console.log is uploaded to:
http://154.223.142.244/logs/20220310/console.paul.log
The log tells us it is illegal instruction that causes the trouble:
[    4.246387][    T1] init[1]: illegal instruction (4) at 1002c308 nip 1002c308 lr 10001684 code 1 in init[10000000+d0000]
[    4.251400][    T1] init[1]: code: f90d88c0 f92a0008 f9480008 7c2004ac 2c2d0000 f9490000 386d88d0 380000e8
[    4.253416][    T1] init[1]: code: 41820098 e92d8f98 75290010 4182008c <44000001> 2c2d0000 60000000 8902f438


Meanwhile, the vmlinux compiled by myself runs smoothly.

How did you build it? Using GCC or clang? I forgot, if the problem was only reproducible if the host Linux kernel was built with clang or the VM kernel.

Then I modify mkinitrd.sh to let it panic manually:
http://154.223.142.244/logs/20220310/mkinitrd.sh

I only see the change:

    -
    +	int *ptr = 0;
    +	*ptr =  0;

The log tells us it is a segfault (instead of a illegal instruction):
http://154.223.142.244/logs/20220310/console.zhouyi.log

Then I use gdb to debug the init in host:
ubuntu@zhouzhouyi-1:~/newkernel/linux-next$ gdb
tools/testing/selftests/rcutorture/initrd/init
(gdb) run
Starting program:
/home/ubuntu/newkernel/linux-next/tools/testing/selftests/rcutorture/initrd/init

Program received signal SIGSEGV, Segmentation fault.
0x0000000010000b2c in ?? ()
(gdb) x/10i $pc
=> 0x10000b2c:    stw     r9,0(r9)
    0x10000b30:    trap
    0x10000b34:    .long 0x0
    0x10000b38:    .long 0x0
    0x10000b3c:    .long 0x0
    0x10000b40:    lis     r2,4110
    0x10000b44:    addi    r2,r2,31488
    0x10000b48:    mr      r9,r1
    0x10000b4c:    rldicr  r1,r1,0,59
    0x10000b50:    li      r0,0
(gdb) p $r9
$1 = 0
(gdb) x/30x $pc - 0x30
0x10000afc:    0x38840040    0x387f0040    0xf8010040    0x48026919
0x10000b0c:    0x60000000    0xe8010040    0x7c0803a6    0x4bffff24
0x10000b1c:    0x00000000    0x01000000    0x00000180    0x39200000
0x10000b2c:    0x91290000    0x7fe00008    0x00000000    0x00000000
which matches the hex content of
http://154.223.142.244/logs/20220310/console.zhouyi.log:
[    5.077431][    T1] init[1]: segfault (11) at 0 nip 10000b2c lr 10001024 code 1 in init[10000000+d0000]
[    5.087167][    T1] init[1]: code: 38840040 387f0040 f8010040 48026919 60000000 e8010040 7c0803a6 4bffff24
[    5.093987][    T1] init[1]: code: 00000000 01000000 00000180 39200000 <91290000> 7fe00008 00000000 00000000


Conclusions: there might be something wrong when packing the init into
vmlinux in your environment.

I will continue to do research on this interesting problem with you.

As written I think it’s a problem with LLVM/clang. Unfortunately, I won’t be able to retest before next week.


Kind regards,

Paul



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux