Re: [PATCH] asm-generic/mmiowb: Get cpu in mmiowb_set_pending

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 5 Jul 2020 at 17:52, Guo Ren <guoren@xxxxxxxxxx> wrote:
> On Sun, Jul 5, 2020 at 11:03 PM Emil Renner Berthing <kernel@xxxxxxxx> wrote:
> > On Sun, 5 Jul 2020 at 16:44, Guo Ren <guoren@xxxxxxxxxx> wrote:
> > >
> > > Hi Emil,
> > >
> > > On Sun, Jul 5, 2020 at 10:27 PM Emil Renner Berthing <kernel@xxxxxxxx> wrote:
> > > >
> > > > Without this enabling CONFIG_PREEMPT and CONFIG_DEBUG_PREEMPT
> > > > results in many errors like this on the HiFive Unleashed
> > > > RISC-V board:
> > > >
> > > > BUG: using smp_processor_id() in preemptible [00000000] code: swapper/0/1
> > > > caller is regmap_mmio_write32le+0x1c/0x46
> > > > CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.8.0-rc3-hfu+ #1
> > > > Call Trace:
> > > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc
> > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > [<ffffffe000001bea>] kernel_init_freeable+0x152/0x1da
> > > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > > [<ffffffe0005c4d36>] kernel_init+0xa/0x11a
> > > > [<ffffffe0005c4d28>] rest_init+0xde/0xe2
> > > > [<ffffffe000200ff6>] ret_from_syscall_rejected+0x8/0xc
> > > >
> > > > Signed-off-by: Emil Renner Berthing <kernel@xxxxxxxx>
> > > > ---
> > > > This patch fixes it, but my guess is that it's not the right
> > > > fix. Do anyone have a better idea?
> > > >
> > > >  include/asm-generic/mmiowb.h | 6 +++++-
> > > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/include/asm-generic/mmiowb.h b/include/asm-generic/mmiowb.h
> > > > index 9439ff037b2d..31a21cdfbbcf 100644
> > > > --- a/include/asm-generic/mmiowb.h
> > > > +++ b/include/asm-generic/mmiowb.h
> > > > @@ -34,8 +34,12 @@ DECLARE_PER_CPU(struct mmiowb_state, __mmiowb_state);
> > > >
> > > >  static inline void mmiowb_set_pending(void)
> > > >  {
> > > > -       struct mmiowb_state *ms = __mmiowb_state();
> > > > +       struct mmiowb_state *ms;
> > > > +
> > > > +       get_cpu();
> > > > +       ms = __mmiowb_state();
> > > >         ms->mmiowb_pending = ms->nesting_count;
> > > > +       put_cpu();
> > > >  }
> > >
> > > #define __mmiowb_state()        this_cpu_ptr(&__mmiowb_state)
> > >
> > > The ptr is a fixed address, so don't worry about the change, and just
> > > use an atomic_read is enough.
> > > static inline void mmiowb_set_pending(void)
> > > {
> > >         struct mmiowb_state *ms = __mmiowb_state();
> > > -       ms->mmiowb_pending = ms->nesting_count;
> > > +      ms->mmiowb_pending = atomic_read(ms->nesting_count);
> > > }
> >
> > You may be right, but it doesn't fix the BUG. As far as I can tell it
> > happens in __mmiowb_state() which expands through this_cpu_ptr and
> > arch_raw_cpu_ptr to SHIFT_PERCPU_PTR(ptr, __my_cpu_offset), where
> > __my_cpu_offset is per_cpu_offset(smp_processor_id()) and with
> > CONFIG_DEBUG_PREEMPT smp_processor_id is actually
> > debug_smp_processor_id, which eventually checks that preemption is
> > disabled in check_preemption_disabled.
> Thx for explaining.
>
> Seems we need to find who disable preemption during:
> > > > [<ffffffe000201f6e>] walk_stackframe+0x0/0x7a
> > > > [<ffffffe0005b290e>] dump_stack+0x6e/0x88
> > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > [<ffffffe0005c4c26>] check_preemption_disabled+0xa4/0xaa
> > > > [<ffffffe00047365e>] regmap_mmio_write32le+0x18/0x46
> > > > [<ffffffe0004737c8>] regmap_mmio_write+0x26/0x44
> > > > [<ffffffe0004715c4>] regmap_write+0x28/0x48
> > > > [<ffffffe00043dccc>] sifive_gpio_probe+0xc0/0x1da
> > > > [<ffffffe00000113e>] rdinit_setup+0x22/0x26
> > > > [<ffffffe000469054>] platform_drv_probe+0x24/0x52
> > > > [<ffffffe000467e16>] really_probe+0x92/0x21a
> > > > [<ffffffe0004683a8>] device_driver_attach+0x42/0x4a
> > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > [<ffffffe0004683f0>] __driver_attach+0x40/0xac
> > > > [<ffffffe0004683ac>] device_driver_attach+0x46/0x4a
> > > > [<ffffffe000466a3e>] bus_for_each_dev+0x3c/0x64
> > > > [<ffffffe000467118>] bus_add_driver+0x11e/0x184
> > > > [<ffffffe00046889a>] driver_register+0x32/0xc6
> > > > [<ffffffe00000e5ac>] gpiolib_sysfs_init+0xaa/0xae
> > > > [<ffffffe0000019ec>] do_one_initcall+0x50/0xfc

Hmm.. the problem is that preemption is *not* disabled when
smp_processor_id is called, right?

> do_one_initcall's preempt_count = 0
> (gdb) bt
> #0  do_one_initcall (fn=0xffffffe000003b0e
> <trace_init_flags_sys_exit>) at init/main.c:1190
> #1  0xffffffe000001f20 in do_pre_smp_initcalls () at ./include/linux/init.h:131
>
> #2  kernel_init_freeable () at init/main.c:1494
> #3  0xffffffe0009d6ea6 in kernel_init (unused=<optimized out>) a
>    t init/main.c:1399
> #4  0xffffffe000201c2a in handle_exception () at arch/riscv/kernel/entry.S:188
> Backtrace stopped: frame did not save the PC
> (gdb) p *(struct task_struct*)$tp
> $2 = {thread_info = {flags = 0, preempt_count = 0,
>
> Can you debug like this ? to see which function's preempt_count = 0 in
> your backtrace.

I'm sorry, I'm not exactly sure what you mean by this, but it seems to
happen multiple places in writel-like functions. Both at probe time in
sifive_gpio_probe and sifive_spi_probe and when the spi driver is
running. Here is the full log:

http://sprunge.us/rgfC3r

/Emil



[Index of Archives]     [Linux Kernel]     [Kernel Newbies]     [x86 Platform Driver]     [Netdev]     [Linux Wireless]     [Netfilter]     [Bugtraq]     [Linux Filesystems]     [Yosemite Discussion]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]

  Powered by Linux