Re: SPI regression with today's build

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[cc:Martin]

Den 08.05.2019 19.07, skrev Nicolas Saenz Julienne:
> Small follow-up on this, and CCing Noralf as I forgot to add him on the last
> e-mail.
> 
> On Wed, 2019-05-08 at 17:01 +0200, Nicolas Saenz Julienne wrote:
>> Hi, This has been seen on a Raspberry Pi 3B+, using a w5500 Ethernet SPI
>> interface on the main SPI controller. The commit that introduced the issue
>> seems to be:
>>
>> c9ba7a16d0f1d2b1e70d47296eca5f612c753825 "spi: Release spi_res after
>> finalizing message"
>>
>> Here's the kernel log:
>>
>> [  170.085704] Unable to handle kernel paging request at virtual address
>> 0000000000100000
>> [  170.097773] Mem abort info:
>> [  170.101873]   ESR = 0x86000004
>> [  170.105222]   Exception class = IABT (current EL), IL = 32 bits
>> [  170.111599]   SET = 0, FnV = 0
>> [  170.115040]   EA = 0, S1PTW = 0
>> [  170.118484] user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000ed70a017
>> [  170.125511] [0000000000100000] pgd=0000000000000000
>> [  170.130872] Internal error: Oops: 86000004 [#1] PREEMPT SMP
>> [  170.136532] Modules linked in: fuse vc4 cec drm_kms_helper cfbfillrect
>> brcmfmac cfbimgblt panel_raspberrypi_touchscreen cfbcopyarea drm hci_uart
>> btbcm brcmutil bluetooth cfg80211 w5100_spi bcm2835_v4l2(C) w5100 v4l2_common
>> videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev
>> microchip lan78xx crct10dif_ce drm_panel_orientation_quirks i2c_gpio
>> ecdh_generic edt_ft5x06 raspberrypi_hwmon ecc media rfkill snd_bcm2835(C)
>> input_polldev bcm2835_thermal spi_bcm2835 bcm2835_dma pwm_bcm2835 bcm2835_rng
>> rng_core virt_dma ip_tables x_tables ipv6
>> [  170.186506] Process spi0 (pid: 1335, stack limit = 0x00000000fb9981f5)
>> [  170.193133] CPU: 0 PID: 1335 Comm: spi0 Tainted: G        WC        5.1.0-
>> nico+ #116
>> [  170.200985] Hardware name: Raspberry Pi 3 Model B Plus Rev 1.3 (DT)
>> [  170.207343] pstate: 60000005 (nZCv daif -PAN -UAO)
>> [  170.212205] pc : 0x100000
>> [  170.214864] lr : spi_res_release.part.9+0x7c/0xb0
>> [  170.219631] sp : ffff000011303ce0
>> [  170.222987] x29: ffff000011303ce0 x28: ffff00001122bc30 
>> [  170.228373] x27: 00000000cccccccd x26: ffff000010ee96c8 
>> [  170.233759] x25: dead000000000100 x24: dead000000000200 
>> [  170.239143] x23: ffff000010d28b78 x22: ffff80002fa60e80 
>> [  170.244528] x21: ffff80002567b000 x20: ffff00001122bc88 
>> [  170.249914] x19: ffff00001122bc30 x18: 0000000000000020 
>> [  170.255297] x17: 0000000000000000 x16: 0000000000000000 
>> [  170.260682] x15: ffffffffffffffff x14: ffff000010ee96c8 
>> [  170.266068] x13: ffff000010fd3388 x12: ffff000010fd2fb7 
>> [  170.271453] x11: 0000000000000000 x10: ffff000011303ce0 
>> [  170.276840] x9 : ffff000011303ce0 x8 : 000000000000000f 
>> [  170.282226] x7 : 0000000000000005 x6 : ffff000010fd264d 
>> [  170.287612] x5 : 0000000000000000 x4 : 0000000000000000 
>> [  170.292998] x3 : 0000000000100000 x2 : ffff80002fa60e98 
>> [  170.298384] x1 : ffff00001122bc30 x0 : ffff80002567b000 
>> [  170.303769] Call trace:
>> [  170.306247]  0x100000
>> [  170.308550]  spi_transfer_one_message+0x268/0x388
>> [  170.313321]  __spi_pump_messages+0x200/0x508
>> [  170.317651]  spi_pump_messages+0x14/0x20
>> [  170.321630]  kthread_worker_fn+0xa8/0x188
>> [  170.325694]  kthread+0x120/0x128
>> [  170.328966]  ret_from_fork+0x10/0x18
>> [  170.332595] Code: bad PC value
>> [  170.335694] ---[ end trace d054c016d94fa755 ]---
> 
> It seems the SPI controller thread is racing with the device's thread.
> Something like this:
> 
>      SPI Controller Thread                       SPI Device Thread
> 
> 					    -> spi_sync_transfer() creates
> 					       spi_message on stack then
> 					       sleeps until finished
> 
> 			[SPI transfer happens...]
> 
> -> spi_finalize_current_message()
>    which wakes up SPI Device Thread
> 
>                                             -> spi_sync_transfer() returns, the
> 					       message disapears from the stack
> 
> -> spi_res_release() there is no more
>    spi_message and the memory is
>    potentialy used for something else
> 
> I've been looking at the spi_split_transfers_maxsize() code and can't think of
> a reason why spi_res_release() couldn't be placed before
> spi_finalize_current_message(). Which would solve the issue, but I guess Noralf
> has a better perspective on the topic.
> 

The problem was that spi_res_release() restored the original transfers
before spi_unmap_msg() is called in spi_finalize_current_message() thus
dma unmapping the original transfers, not the split ones that was mapped.

This is the accompanying change to the driver that hasn't been applied:
[v5,3/4] spi/spi-bcm2835: Split transfers that exceed DLEN
https://patchwork.kernel.org/patch/10899587/

Unless Martin Sperl, who wrote spi_split_transfers_maxsize(), has other
suggestions, I think we should just revert this patch.

Apparently I need to go back to the drawing board with this.

Noralf.

> Regards,
> Nicolas
> 



[Index of Archives]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux