Re: [PATCH v2 05/10] spi: bcm2835: Drop dma_pending flag

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11.09.2019 12:15:30, Lukas Wunner wrote:
> The BCM2835 SPI driver uses a flag to keep track of whether a DMA
> transfer is in progress.
>
> The flag is used to avoid terminating DMA channels multiple times if a
> transfer finishes orderly while simultaneously the SPI core invokes the
> ->handle_err() callback because the transfer took too long.  However
> terminating DMA channels multiple times is perfectly fine, so the flag
> is unnecessary for this particular purpose.
>
> The flag is also used to avoid invoking bcm2835_spi_undo_prologue()
> multiple times under this race condition.  However multiple *concurrent*
> invocations can no longer happen since commit 2527704d8411 ("spi:
> bcm2835: Synchronize with callback on DMA termination") because the
> ->handle_err() callback now uses the _sync() variant when terminating
> DMA channels.
>
> The only raison d'être of the flag is therefore that
> bcm2835_spi_undo_prologue() cannot cope with multiple *sequential*
> invocations.  Achieve that by setting tx_prologue to 0 at the end of
> the function.  Subsequent invocations thus become no-ops.
>
> With that, the dma_pending flag becomes unnecessary, so drop it.
>
> Tested-by: Nuno Sá <nuno.sa@xxxxxxxxxx>
> Tested-by: Noralf Trønnes <noralf@xxxxxxxxxxx>
> Signed-off-by: Lukas Wunner <lukas@xxxxxxxxx>
> Acked-by: Stefan Wahren <wahrenst@xxxxxxx>
> Acked-by: Martin Sperl <kernel@xxxxxxxxxxxxxxxx>

I think this patch breaks the bcm2835_spi_handle_err() function, which
may be called for non-DMA transfers, too. In my test case a non DMA, non
polling transfer runs into a timeout and produces this backtrace:

| [ 1651.800430] spidev spi3.0: SPI transfer timed out   
| [ 1651.800468] Internal error: Oops: 206 [#1] PREEMPT_RT SMP ARM
| [ 1651.800473] Modules linked in: can_raw can brcmfmac mcp251xfd can_dev spidev brcmutil bcm2835_codec(C) bcm2835_v4l2(C) v4l2_mem2mem bcm2835_isp(C) videobuf2_vmalloc bcm2835_mmal_vchiq(C) cfg80211 videobuf2_dma_contig dwc2 videobuf2_memops videobuf2_v4l2 videobuf2_common
| videodev raspberrypi_hwmon vc_sm_cma(C) rfkill roles spi_bcm2835aux spi_bcm2835 mc rpivid_mem uio_pdrv_genirq uio drm fuse drm_panel_orientation_quirks backlight ip_tables x_tables ipv6
| [ 1651.800533] CPU: 1 PID: 766 Comm: SpiCanTest Tainted: G         C        5.15.40-rt43-v7l+ #2
| [ 1651.800538] Hardware name: BCM2711
| [ 1651.800540] PC is at bcm2835_spi_handle_err+0x20/0xe0 [spi_bcm2835]
| [ 1651.800555] LR is at spi_transfer_one_message+0x54c/0x68c
| [ 1651.800566] pc : [<bf17ce9c>]    lr : [<c0a26000>]    psr: a0030113
| [ 1651.800569] sp : c2947d50  ip : c2947d70  fp : c2947d6c
| [ 1651.800572] r10: c21ce2f8  r9 : ffffff92  r8 : c1558340
| [ 1651.800574] r7 : c2947eac  r6 : 00000000  r5 : c21ce000  r4 : c21ce3c0
| [ 1651.800577] r3 : bf17ce7c  r2 : 00000000  r1 : c2947eac  r0 : c21ce000
| [ 1651.800580] Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
| [ 1651.800584] Control: 30c5383d  Table: 03049980  DAC: 55555555
| [ 1651.800587] Register r0 information: slab kmalloc-2k start c21ce000 pointer offset 0 size 2048
| [ 1651.800599] Register r1 information: non-slab/vmalloc memory
| [ 1651.800604] Register r2 information: NULL pointer
| [ 1651.800608] Register r3 information: 5-page vmalloc region starting at 0xbf17c000 allocated at load_module+0xc98/0x2bc8
| [ 1651.800618] Register r4 information: slab kmalloc-2k start c21ce000 pointer offset 960 size 2048
| [ 1651.800627] Register r5 information: slab kmalloc-2k start c21ce000 pointer offset 0 size 2048
| [ 1651.800636] Register r6 information: NULL pointer
| [ 1651.800639] Register r7 information: non-slab/vmalloc memory
| [ 1651.800643] Register r8 information: non-slab/vmalloc memory
| [ 1651.800646] Register r9 information: non-paged memory
| [ 1651.800650] Register r10 information: slab kmalloc-2k start c21ce000 pointer offset 760 size 2048
| [ 1651.800658] Register r11 information: non-slab/vmalloc memory
| [ 1651.800662] Register r12 information: non-slab/vmalloc memory
| [ 1651.800665] Process SpiCanTest (pid: 766, stack limit = 0x071eb77b)
| [ 1651.800669] Stack: (0xc2947d50 to 0xc2948000)
| [ 1651.800673] 7d40:                                     c3dada40 c4d92380 c21ce000 c2947eac
| [ 1651.800678] 7d60: c2947dcc c2947d70 c0a26000 bf17ce88 c2947db4 00000000 c3dad800 c3dad800
| [ 1651.800682] 7d80: 0147ae0e c10f7fc4 c140756f c21ce2f8 ffffe000 c140689c 00000000 c3dada40
| [ 1651.800686] 7da0: c21ce210 c21ce000 00000000 bf17d050 c2947ed0 c3dada40 c21ce2f8 c2947eac
| [ 1651.800689] 7dc0: c2947e0c c2947dd0 c0a281a0 c0a25ac0 c025643c d55cd1f7 c21ce210 c3dad800
| [ 1651.800693] 7de0: c2947eac c21ce000 c3dad800 c2947eac c21ce000 c3dada40 c21ce2f8 c0a23fc0
| [ 1651.800697] 7e00: c2947e5c c2947e10 c0a28920 c0a27dac c21ce288 c21ce230 00000000 00000000
| [ 1651.800700] 7e20: c2947e20 c2947e20 c0d398d4 d55cd1f7 c2947e5c c3dad800 c2947eac c3dad800
| [ 1651.800704] 7e40: c44987c0 c44987c4 c44987e0 000000ff c2947e74 c2947e60 c0a28a1c c0a28700
| [ 1651.800707] 7e60: c3dad800 c36459e0 c2947f14 c2947e78 bf223a5c c0a289f4 c2947ee8 c44987c4
| [ 1651.800711] 7e80: 00000001 c4d92380 c44987e0 c3dad800 c36459c0 c44987c0 bf226000 000000ff
| [ 1651.800714] 7ea0: c36459e0 c4412100 ffffe000 c4d923e0 c4d923e0 c3dad800 00000000 c0a23388
| [ 1651.800718] 7ec0: c2947e18 000000ff 00000000 ffffff92 c2947ed0 c2947ed0 00000000 c2947edc
| [ 1651.800721] 7ee0: c2947edc d55cd1f7 b6bd4cb0 40206b00 00000000 c50b2001 b6bd4cb0 c50b2000
| [ 1651.800725] 7f00: 00000003 c470a780 c2947fa4 c2947f18 c04726f4 bf223408 b6f3c330 00000193
| [ 1651.800728] 7f20: c0200244 b6bd4cf0 00000010 00000000 00000673 c08601cc c2947f74 c2947f48
| [ 1651.800732] 7f40: c02b6cf0 c08601b8 00000673 00000000 2228e32f 00000000 c2947f74 d55cd1f7
| [ 1651.800735] 7f60: b6bd4cf0 00000001 c2947fa4 c2947f78 c02c92b8 d55cd1f7 00000673 00000003
| [ 1651.800739] 7f80: 005020d0 00502000 00000036 c0200244 c2946000 00000036 00000000 c2947fa8
| [ 1651.800742] 7fa0: c0200040 c04725e0 00000003 005020d0 00000003 40206b00 b6bd4cb0 00000000
| [ 1651.800746] 7fc0: 00000003 005020d0 00502000 00000036 2202be12 b6f35010 00000000 b6bd4e90
| [ 1651.800749] 7fe0: 00502044 b6bd4c74 004f17eb b6cd6da8 400e0030 00000003 00000000 00000000
| [ 1651.800751] Backtrace:
| [ 1651.800754] [<bf17ce7c>] (bcm2835_spi_handle_err [spi_bcm2835]) from [<c0a26000>] (spi_transfer_one_message+0x54c/0x68c)
| [ 1651.800768]  r7:c2947eac r6:c21ce000 r5:c4d92380 r4:c3dada40
| [ 1651.800770] [<c0a25ab4>] (spi_transfer_one_message) from [<c0a281a0>] (__spi_pump_messages+0x400/0x930)
| [ 1651.800781]  r10:c2947eac r9:c21ce2f8 r8:c3dada40 r7:c2947ed0 r6:bf17d050 r5:00000000
| [ 1651.800783]  r4:c21ce000
| [ 1651.800785] [<c0a27da0>] (__spi_pump_messages) from [<c0a28920>] (__spi_sync+0x22c/0x2f4)
| [ 1651.800795]  r10:c0a23fc0 r9:c21ce2f8 r8:c3dada40 r7:c21ce000 r6:c2947eac r5:c3dad800
| [ 1651.800797]  r4:c21ce000
| [ 1651.800798] [<c0a286f4>] (__spi_sync) from [<c0a28a1c>] (spi_sync+0x34/0x4c)
| [ 1651.800808]  r10:000000ff r9:c44987e0 r8:c44987c4 r7:c44987c0 r6:c3dad800 r5:c2947eac
| [ 1651.800810]  r4:c3dad800

> ---
>  drivers/spi/spi-bcm2835.c | 23 ++++++++---------------
>  1 file changed, 8 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c
> index f79f04ea42e5..532c58bcfd45 100644
> --- a/drivers/spi/spi-bcm2835.c
> +++ b/drivers/spi/spi-bcm2835.c
[...]
> @@ -927,11 +921,10 @@ static void bcm2835_spi_handle_err(struct spi_controller *ctlr,
>  	struct bcm2835_spi *bs = spi_controller_get_devdata(ctlr);
>
>  	/* if an error occurred and we have an active dma, then terminate */
> -	if (cmpxchg(&bs->dma_pending, true, false)) {
> -		dmaengine_terminate_sync(ctlr->dma_tx);
> -		dmaengine_terminate_sync(ctlr->dma_rx);
> -		bcm2835_spi_undo_prologue(bs);
> -	}
> +	dmaengine_terminate_sync(ctlr->dma_tx);
> +	dmaengine_terminate_sync(ctlr->dma_rx);

... because the ctrl->dma_tx and ->dma_rx are NULL.

> +	bcm2835_spi_undo_prologue(bs);
> +
>  	/* and reset */
>  	bcm2835_spi_reset_hw(ctlr);
>  }

The question is: Why runs the IRQ based transfer into a timeout? The
kernel that produces the crash has ecfbd3cf3b8b ("spi: bcm2835: Enable
shared interrupt support") applied (which was reverted on mainline in a
later patch).

I'll create a patch to fix the NULL pointer deref. As a interrupt based
transfer might run into a timeout for other reasons, too. So better
avoid a kernel crash in that case.

regards,
Marc

--
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux