Re: FYI, a few new KCSAN splats

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks Paul and Marco for giving me valuable advice.

I studied the structure of kvm.sh and I found enabling
CONFIG_KCSAN_STRICT and lower kcsan.skip_watch will make the KCSAN
detect RCU's data-race within half an hour (I tried this for two
times).
Following is my command:
linux-rcu$git config --get remote.origin.url
https://kernel.source.codeaurora.cn/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/
linux-rcu$tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus
--duration 10h --configs "TREE01" --kcsan --kmake-arg "CC=clang-12"
--trust-make --kconfig CONFIG_KCSAN_STRICT=y --bootargs
kcsan.skip_watch=1000


[ 1289.504892] rcu: De-offloading 5^M
[ 1289.761495] rcu: De-offloading 7^M
[ 1289.763556] ==================================================================^M
[ 1289.764531] BUG: KCSAN: data-race in call_rcu / rcu_nocb_cb_kthread^M
[ 1289.765388] ^M
[ 1289.765599] read (marked) to 0xffff94f2db7ec608 of 1 bytes by
interrupt on cpu 7:^M
[ 1289.766588]  call_rcu+0xad/0xfd0^M
[ 1289.767035]  rcu_torture_timer+0xdc/0xf0^M
[ 1289.767566]  call_timer_fn+0x87/0x150^M
[ 1289.768065]  __run_timers+0x2fd/0x450^M
[ 1289.768562]  run_timer_softirq+0x29/0x60^M
[ 1289.769093]  __do_softirq+0x1d9/0x35c^M
[ 1289.769540]  __irq_exit_rcu+0x13d/0x150^M
[ 1289.770047]  irq_exit_rcu+0x5/0x20^M
[ 1289.770520]  sysvec_apic_timer_interrupt+0x8d/0xb0^M
[ 1289.771168]  asm_sysvec_apic_timer_interrupt+0x12/0x20^M
[ 1289.771854]  _raw_spin_unlock_irqrestore+0x33/0x50^M
[ 1289.772499]  rdp_offload_toggle+0xc9/0x150^M
[ 1289.773050]  rcu_nocb_rdp_deoffload+0x163/0x2c0^M
[ 1289.773670]  work_for_cpu_fn+0x28/0x40^M
[ 1289.774185]  process_one_work+0x50f/0x800^M
[ 1289.774726]  worker_thread+0x71c/0x9f0^M
[ 1289.775237]  kthread+0x23e/0x270^M
[ 1289.775675]  ret_from_fork+0x22/0x30^M
[ 1289.776162] ^M
[ 1289.776377] 3 locks held by kworker/7:2/294:^M
[ 1289.776955]  #0: ffff94f2c1050738
((wq_completion)events){+.+.}-{0:0}, at:
process_one_work+0x49c/0x800^M
[ 1289.778245]  #1: ffffa5434093bde0
((work_completion)(&wfc.work)){+.+.}-{0:0}, at:
process_one_work+0x4cd/0x800^M
[ 1289.779605]  #2: ffffa5434029cdf8 ((&t)){+.-.}-{0:0}, at:
call_timer_fn+0x51/0x150^M
[ 1289.780666] irq event stamp: 215649^M
[ 1289.781186] hardirqs last  enabled at (215648):
[<ffffffff8d6afec2>] __local_bh_enable_ip+0xc2/0x120^M
[ 1289.782471] hardirqs last disabled at (215649):
[<ffffffff8d77c6a3>] call_rcu+0x73/0xfd0^M
[ 1289.783632] softirqs last  enabled at (214628):
[<ffffffff8d6b03dd>] __irq_exit_rcu+0x13d/0x150^M
[ 1289.784866] softirqs last disabled at (215635):
[<ffffffff8d6b03dd>] __irq_exit_rcu+0x13d/0x150^M
[ 1289.786147] ^M
[ 1289.786379] write to 0xffff94f2db7ec608 of 1 bytes by task 63 on cpu 15:^M
[ 1289.787342]  rcu_nocb_cb_kthread+0x2f7/0x740^M
[ 1289.787970]  kthread+0x23e/0x270^M
[ 1289.788450]  ret_from_fork+0x22/0x30^M
[ 1289.788964] ^M

The reader is at
0xffffffff8117c6dd: file kernel/rcu/tree.c, line 2991
2986        local_irq_save(flags);
2987        kasan_record_aux_stack(head);
2988        rdp = this_cpu_ptr(&rcu_data);
2989
2990        /* Add the callback to our list. */
2991        if (unlikely(!rcu_segcblist_is_enabled(&rdp->cblist))) {
2992            // This can trigger due to call_rcu() from offline CPU:
2993            WARN_ON_ONCE(rcu_scheduler_active != RCU_SCHEDULER_INACTIVE);
2994            WARN_ON_ONCE(!rcu_is_watching());
2995            // Very early boot, before rcu_init().  Initialize if needed

The writer is at
0xffffffff81191977: file kernel/rcu/rcu_segcblist.h, line 65
60    }
61
62    static inline void rcu_segcblist_clear_flags(struct rcu_segcblist *rsclp,
63                             int flags)
64    {
65        rsclp->flags &= ~flags;
66    }
67

I gained a lot during the study ;-)

Many thanks
Zhouyi

On Wed, Dec 15, 2021 at 1:59 AM Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> wrote:
>
> Thank Marco for giving me such valuable advice.
>
> Both RCU and KCSAN are very attractive and amazing to me, I have
> carefully read your advice and will strive to acquire expertise in the
> following days.
>
> Thanks again
> Zhouyi
>
> On Wed, Dec 15, 2021 at 12:57 AM Marco Elver <elver@xxxxxxxxxx> wrote:
> >
> > On Tue, 14 Dec 2021 at 09:38, Zhouyi Zhou <zhouzhouyi@xxxxxxxxx> wrote:
> > [...]
> > > 2) The KCSAN kernel compile with clang-12:
> > > CONFIG_HAVE_ARCH_KCSAN=y
> > > CONFIG_HAVE_KCSAN_COMPILER=y
> > > CONFIG_KCSAN=y
> > > CONFIG_KCSAN_SELFTEST=y
> > > # CONFIG_KCSAN_EARLY_ENABLE is not set
> > > CONFIG_KCSAN_NUM_WATCHPOINTS=64
> > > CONFIG_KCSAN_UDELAY_TASK=80
> > > CONFIG_KCSAN_UDELAY_INTERRUPT=20
> > > CONFIG_KCSAN_DELAY_RANDOMIZE=y
> > > CONFIG_KCSAN_SKIP_WATCH=4000
> > > CONFIG_KCSAN_SKIP_WATCH_RANDOMIZE=y
> > > # CONFIG_KCSAN_INTERRUPT_WATCHER is not set
> > > CONFIG_KCSAN_REPORT_ONCE_IN_MS=3000
> > > # CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN is not set
> > > # CONFIG_KCSAN_STRICT is not set
> > > CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=y
> > > # CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC is not set
> > > CONFIG_KCSAN_IGNORE_ATOMICS=y
> > > CONFIG_KCSAN_PERMISSIVE=y
> >
> > Note, rcutorture selects CONFIG_KCSAN_STRICT=y, which may be one part
> > of the issue.  The other is that the -rcu tree has KCSAN enhancements
> > not yet available in mainline, which are enabled together with
> > CONFIG_KCSAN_STRICT, and make KCSAN slightly more aggressive (i.e.
> > finds data races faster) as well. Although you can also locally make
> > KCSAN a bit more aggressive by lowering CONFIG_KCSAN_SKIP_WATCH (boot
> > param kcsan.skip_watch=), to e.g. 2000-3000, but too low and you'll
> > hurt forward-progress.
> >
> > -- Marco



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux