Re: SPI regression with today's build

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Small follow-up on this, and CCing Noralf as I forgot to add him on the last
e-mail.

On Wed, 2019-05-08 at 17:01 +0200, Nicolas Saenz Julienne wrote:
> Hi, This has been seen on a Raspberry Pi 3B+, using a w5500 Ethernet SPI
> interface on the main SPI controller. The commit that introduced the issue
> seems to be:
> 
> c9ba7a16d0f1d2b1e70d47296eca5f612c753825 "spi: Release spi_res after
> finalizing message"
> 
> Here's the kernel log:
> 
> [  170.085704] Unable to handle kernel paging request at virtual address
> 0000000000100000
> [  170.097773] Mem abort info:
> [  170.101873]   ESR = 0x86000004
> [  170.105222]   Exception class = IABT (current EL), IL = 32 bits
> [  170.111599]   SET = 0, FnV = 0
> [  170.115040]   EA = 0, S1PTW = 0
> [  170.118484] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000ed70a017
> [  170.125511] [0000000000100000] pgd=0000000000000000
> [  170.130872] Internal error: Oops: 86000004 [#1] PREEMPT SMP
> [  170.136532] Modules linked in: fuse vc4 cec drm_kms_helper cfbfillrect
> brcmfmac cfbimgblt panel_raspberrypi_touchscreen cfbcopyarea drm hci_uart
> btbcm brcmutil bluetooth cfg80211 w5100_spi bcm2835_v4l2(C) w5100 v4l2_common
> videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev
> microchip lan78xx crct10dif_ce drm_panel_orientation_quirks i2c_gpio
> ecdh_generic edt_ft5x06 raspberrypi_hwmon ecc media rfkill snd_bcm2835(C)
> input_polldev bcm2835_thermal spi_bcm2835 bcm2835_dma pwm_bcm2835 bcm2835_rng
> rng_core virt_dma ip_tables x_tables ipv6
> [  170.186506] Process spi0 (pid: 1335, stack limit = 0x00000000fb9981f5)
> [  170.193133] CPU: 0 PID: 1335 Comm: spi0 Tainted: G        WC        5.1.0-
> nico+ #116
> [  170.200985] Hardware name: Raspberry Pi 3 Model B Plus Rev 1.3 (DT)
> [  170.207343] pstate: 60000005 (nZCv daif -PAN -UAO)
> [  170.212205] pc : 0x100000
> [  170.214864] lr : spi_res_release.part.9+0x7c/0xb0
> [  170.219631] sp : ffff000011303ce0
> [  170.222987] x29: ffff000011303ce0 x28: ffff00001122bc30 
> [  170.228373] x27: 00000000cccccccd x26: ffff000010ee96c8 
> [  170.233759] x25: dead000000000100 x24: dead000000000200 
> [  170.239143] x23: ffff000010d28b78 x22: ffff80002fa60e80 
> [  170.244528] x21: ffff80002567b000 x20: ffff00001122bc88 
> [  170.249914] x19: ffff00001122bc30 x18: 0000000000000020 
> [  170.255297] x17: 0000000000000000 x16: 0000000000000000 
> [  170.260682] x15: ffffffffffffffff x14: ffff000010ee96c8 
> [  170.266068] x13: ffff000010fd3388 x12: ffff000010fd2fb7 
> [  170.271453] x11: 0000000000000000 x10: ffff000011303ce0 
> [  170.276840] x9 : ffff000011303ce0 x8 : 000000000000000f 
> [  170.282226] x7 : 0000000000000005 x6 : ffff000010fd264d 
> [  170.287612] x5 : 0000000000000000 x4 : 0000000000000000 
> [  170.292998] x3 : 0000000000100000 x2 : ffff80002fa60e98 
> [  170.298384] x1 : ffff00001122bc30 x0 : ffff80002567b000 
> [  170.303769] Call trace:
> [  170.306247]  0x100000
> [  170.308550]  spi_transfer_one_message+0x268/0x388
> [  170.313321]  __spi_pump_messages+0x200/0x508
> [  170.317651]  spi_pump_messages+0x14/0x20
> [  170.321630]  kthread_worker_fn+0xa8/0x188
> [  170.325694]  kthread+0x120/0x128
> [  170.328966]  ret_from_fork+0x10/0x18
> [  170.332595] Code: bad PC value
> [  170.335694] ---[ end trace d054c016d94fa755 ]---

It seems the SPI controller thread is racing with the device's thread.
Something like this:

     SPI Controller Thread                       SPI Device Thread

					    -> spi_sync_transfer() creates
					       spi_message on stack then
					       sleeps until finished

			[SPI transfer happens...]

-> spi_finalize_current_message()
   which wakes up SPI Device Thread

                                            -> spi_sync_transfer() returns, the
					       message disapears from the stack

-> spi_res_release() there is no more
   spi_message and the memory is
   potentialy used for something else

I've been looking at the spi_split_transfers_maxsize() code and can't think of
a reason why spi_res_release() couldn't be placed before
spi_finalize_current_message(). Which would solve the issue, but I guess Noralf
has a better perspective on the topic.

Regards,
Nicolas

Attachment: signature.asc
Description: This is a digitally signed message part


[Index of Archives]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux