Re: spidev regression in 6.2-rc kernel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Jan 16, 2023 at 2:23 PM Max Krummenacher
<max.krummenacher@xxxxxxxxxxx> wrote:
>
> Hi
>
> On Mon, 2023-01-16 at 13:47 +0100, Bartosz Golaszewski wrote:
> > On Mon, 16 Jan 2023 at 13:19, Bartosz Golaszewski <brgl@xxxxxxxx> wrote:
> > >
> > > On Mon, Jan 16, 2023 at 1:06 PM Francesco Dolcini <francesco@xxxxxxxxxx> wrote:
> > > >
> > > > Hello,
> > > > we spotted a regression on spidev on latest 6.2-rc kernel.
> > > >
> > > > [  214.047619]
> > > > [  214.049198] ============================================
> > > > [  214.054533] WARNING: possible recursive locking detected
> > > > [  214.059858] 6.2.0-rc3-0.0.0-devel+git.97ec4d559d93 #1 Not tainted
> > > > [  214.065969] --------------------------------------------
> > > > [  214.071290] spidev_test/1454 is trying to acquire lock:
> > > > [  214.076530] c4925dbc (&spidev->spi_lock){+.+.}-{3:3}, at: spidev_ioctl+0x8e0/0xab8
> > > > [  214.084164]
> > > > [  214.084164] but task is already holding lock:
> > > > [  214.090007] c4925dbc (&spidev->spi_lock){+.+.}-{3:3}, at: spidev_ioctl+0x44/0xab8
> > > > [  214.097537]
> > > > [  214.097537] other info that might help us debug this:
> > > > [  214.104075]  Possible unsafe locking scenario:
> > > > [  214.104075]
> > > > [  214.110004]        CPU0
> > > > [  214.112461]        ----
> > > > [  214.114916]   lock(&spidev->spi_lock);
> > > > [  214.118687]   lock(&spidev->spi_lock);
> > > > [  214.122457]
> > > > [  214.122457]  *** DEADLOCK ***
> > > > [  214.122457]
> > > > [  214.128386]  May be due to missing lock nesting notation
> > > > [  214.128386]
> > > > [  214.135183] 2 locks held by spidev_test/1454:
> > > > [  214.139553]  #0: c4925dbc (&spidev->spi_lock){+.+.}-{3:3}, at: spidev_ioctl+0x44/0xab8
> > > > [  214.147524]  #1: c4925e14 (&spidev->buf_lock){+.+.}-{3:3}, at: spidev_ioctl+0x70/0xab8
> > > > [  214.155493]
> > > > [  214.155493] stack backtrace:
> > > > [  214.159861] CPU: 0 PID: 1454 Comm: spidev_test Not tainted 6.2.0-rc3-0.0.0-devel+git.97ec4d559d93 #1
> > > > [  214.169012] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
> > > > [  214.175555]  unwind_backtrace from show_stack+0x10/0x14
> > > > [  214.180819]  show_stack from dump_stack_lvl+0x60/0x90
> > > > [  214.185900]  dump_stack_lvl from __lock_acquire+0x874/0x2858
> > > > [  214.191584]  __lock_acquire from lock_acquire+0xfc/0x378
> > > > [  214.196918]  lock_acquire from __mutex_lock+0x9c/0x8a8
> > > > [  214.202083]  __mutex_lock from mutex_lock_nested+0x1c/0x24
> > > > [  214.207597]  mutex_lock_nested from spidev_ioctl+0x8e0/0xab8
> > > > [  214.213284]  spidev_ioctl from sys_ioctl+0x4d0/0xe2c
> > > > [  214.218277]  sys_ioctl from ret_fast_syscall+0x0/0x1c
> > > > [  214.223351] Exception stack(0xe75cdfa8 to 0xe75cdff0)
> > > > [  214.228422] dfa0:                   00000000 00001000 00000003 40206b00 bee266e8 bee266e0
> > > > [  214.236617] dfc0: 00000000 00001000 006a71a0 00000036 004c0040 004bfd18 00000000 00000003
> > > > [  214.244809] dfe0: 00000036 bee266c8 b6f16dc5 b6e8e5f6
> > > >
> > > >
> > > > This is not running the latest rc4, but on sha 97ec4d559d93 (this is
> > > > just what our CI had available when this test was run). I was not able
> > > > to bisect it, but it seems something that you could have introduced.
> > > >
> > > > The log is from an apalis-imx6, but I have the same on other ARM SOC.
> > > >
> > > > Can you have a look?
> > > > Thanks
> > > >
> > > > Francesco
> > > >
> > >
> > > Eek! Yes it's commit 1f4d2dd45b6e ("spi: spidev: fix a race condition
> > > when accessing spidev->spi"): spidev_ioctl() takes the lock and in
> > > certain instances can end up calling spidev_compat_ioc_message() which
> > > takes the same lock again. I'll send a fix shortly.
> > >
> > > Bart
> >
> > Seems, like that's not it... Francesco: what is the output of:
> >
> > ./scripts/faddr2line drivers/spi/spidev.o spidev_ioctl+0x44/0xab8
> >
> > and
> >
> > ./scripts/faddr2line drivers/spi/spidev.o spidev_ioctl+0x44/0xab8
> >
> > on the spidev.o object for that build?
> >
> > Bart
> >
> I did this for Francesco.
>
> The CI build isn't configured for this:
> ./scripts/faddr2line drivers/spi/spidev.o spidev_ioctl+0x44/0xab8
> ERROR: CONFIG_DEBUG_INFO not enabled
>
> So I recompiled the kernel and rerun the test.
> [  146.916497] ============================================
> [  146.921821] WARNING: possible recursive locking detected
> [  146.927143] 6.2.0-rc3-00349-g97ec4d559d93 #4 Not tainted
> [  146.932467] --------------------------------------------
> [  146.937786] spidev_test/558 is trying to acquire lock:
> [  146.942935] c4c79c3c (&spidev->spi_lock){+.+.}-{3:3}, at: spidev_ioctl+0x89c/0xa78
> [  146.950565]
> [  146.950565] but task is already holding lock:
> [  146.956405] c4c79c3c (&spidev->spi_lock){+.+.}-{3:3}, at: spidev_ioctl+0x40/0xa78
> [  146.963932]
> [  146.963932] other info that might help us debug this:
> [  146.970467]  Possible unsafe locking scenario:
> [  146.970467]
> [  146.976394]        CPU0
> [  146.978849]        ----
> [  146.981302]   lock(&spidev->spi_lock);
> [  146.985071]   lock(&spidev->spi_lock);
> [  146.988838]
> [  146.988838]  *** DEADLOCK ***
> [  146.988838]
> [  146.994765]  May be due to missing lock nesting notation
> [  146.994765]
> [  147.001561] 2 locks held by spidev_test/558:
> [  147.005842]  #0: c4c79c3c (&spidev->spi_lock){+.+.}-{3:3}, at: spidev_ioctl+0x40/0xa78
> [  147.013808]  #1: c4c79c94 (&spidev->buf_lock){+.+.}-{3:3}, at: spidev_ioctl+0x6c/0xa78
> [  147.021774]
> [  147.021774] stack backtrace:
> [  147.026141] CPU: 0 PID: 558 Comm: spidev_test Not tainted 6.2.0-rc3-00349-g97ec4d559d93 #4
> [  147.034419] Hardware name: Freescale i.MX7 Dual (Device Tree)
> [  147.040180]  unwind_backtrace from show_stack+0x10/0x14
> [  147.045433]  show_stack from dump_stack_lvl+0x58/0x70
> [  147.050506]  dump_stack_lvl from __lock_acquire+0x13cc/0x28e0
> [  147.056279]  __lock_acquire from lock_acquire+0xf4/0x368
> [  147.061616]  lock_acquire from __mutex_lock+0x80/0x8a0
> [  147.066783]  __mutex_lock from mutex_lock_nested+0x1c/0x24
> [  147.072295]  mutex_lock_nested from spidev_ioctl+0x89c/0xa78
> [  147.077983]  spidev_ioctl from sys_ioctl+0x540/0xddc
> [  147.082972]  sys_ioctl from ret_fast_syscall+0x0/0x1c
> [  147.088043] Exception stack(0xf1159fa8 to 0xf1159ff0)
> [  147.093111] 9fa0:                   00000000 00001000 00000003 40206b00 befe3a28 befe3a20
> [  147.101303] 9fc0: 00000000 00001000 014b01a0 00000036 00420040 0041fd18 00000000 00000003
> [  147.109493] 9fe0: 00000036 befe3a08 b6e86dc5 b6dfe5f6
>
> Using the changed code layout from this warning I get:
>
> $ ./scripts/faddr2line drivers/spi/spidev.o spidev_ioctl+0x89c/0xa78
> spidev_ioctl+0x89c/0xa78:
> spidev_sync at /home/krm/git.toradex.com/linux-toradex-mainline/drivers/spi/spidev.c:99
> (inlined by) spidev_message at /home/krm/git.toradex.com/linux-toradex-mainline/drivers/spi/spidev.c:297
> (inlined by) spidev_ioctl at /home/krm/git.toradex.com/linux-toradex-mainline/drivers/spi/spidev.c:504
>
> $ ./scripts/faddr2line drivers/spi/spidev.o spidev_ioctl+0x40/0xa78
> spidev_ioctl+0x40/0xa78:
> spidev_ioctl at /home/krm/git.toradex.com/linux-toradex-mainline/drivers/spi/spidev.c:363
>
> $ ./scripts/faddr2line drivers/spi/spidev.o spidev_ioctl+0x6c/0xa78
> spidev_ioctl+0x6c/0xa78:
> spidev_ioctl at /home/krm/git.toradex.com/linux-toradex-mainline/drivers/spi/spidev.c:377
>
> Regards
> Max

Thanks Max, that makes sense now, working on a fix.

Bart



[Index of Archives]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux