Re: IRQ handler mcp251xfd_handle_tefif() returned -22

Hello Marc,

just a short update on this. We are using the Raspberry Pi kernel.
I tested with kernel 6.6.45, which is:
https://github.com/raspberrypi/linux/tree/209e8a3e6646f25abb352fd5a8a4c2e855b1e952
With that version the problem (IRQ handler mcp251xfd_handle_tefif() returned -22) does not occur.

My initial finding was with kernel 6.6.47:
https://github.com/raspberrypi/linux/tree/8beb6891489c3c99618a7390578109aadfdf8901
So it seems that one of these two commits introduced the problem:
https://github.com/raspberrypi/linux/commit/1333fd55d12edf973b72010c63bfe6b334c76b49
https://github.com/raspberrypi/linux/commit/759822a3300cff86d8ea5391173dd557b2d1c7e3

I then tried to add the debug output you suggested, and for that I used today's kernel, which is 6.6.51:
https://github.com/raspberrypi/linux/tree/0fb3c83a9fa3011cb735ec011b7582d4749957b2

But with this one I get:
[ 2028.554168] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 2028.569440] Mem abort info:
[ 2028.572479]   ESR = 0x0000000096000005
[ 2028.576432]   EC = 0x25: DABT (current EL), IL = 32 bits
[ 2028.583165]   SET = 0, FnV = 0
[ 2028.586269]   EA = 0, S1PTW = 0
[ 2028.589540]   FSC = 0x05: level 1 translation fault
[ 2028.596251] Data abort info:
[ 2028.599356]   ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
[ 2028.605216]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 2028.610617]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 2028.616241] user pgtable: 4k pages, 39-bit VAs, pgdp=00000000e8e25000
[ 2028.627701] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 2028.638875] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
[ 2028.645160] Modules linked in: can_raw can xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype nft_compat br_netfilter bridge stp llc nft_reject_ipv4 nf_reject_ipv4 nft_reject nft_ct nft_masq nft_chain_nat nf_tables nfnetlink nf_nat_h323 nf_conntrack_h323 nf_nat_pptp nf_conntrack_pptp nf_nat_tftp nf_conntrack_tftp nf_nat_sip nf_conntrack_sip nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 overlay ina2xx_adc kfifo_buf industrialio leds_lp50xx mcp251xfd ina2xx led_class_multicolor can_dev rtc_bq32k regmap_i2c dp83tc811 brcmfmac_wcc brcmfmac brcmutil cfg80211 binfmt_misc rfkill bcm2835_v4l2(C) bcm2835_isp(C) bcm2835_codec(C) rpivid_hevc(C) bcm2835_mmal_vchiq(C) dwc2 i2c_mux_pinctrl v4l2_mem2mem snd_bcm2835(C) videobuf2_vmalloc snd_pcm videobuf2_dma_contig videobuf2_memops snd_timer videobuf2_v4l2 raspberrypi_hwmon videodev i2c_mux snd videobuf2_common vc_sm_cma(C) mc spi_bcm2835 i2c_bcm2835 raspberrypi_gpiomem
[ 2028.645357]  spi_bcm2835aux gpio_keys nvmem_rmem uio_pdrv_genirq uio drm fuse drm_panel_orientation_quirks dm_mod backlight ip_tables x_tables ipv6
[ 2028.748326] CPU: 1 PID: 2977 Comm: ip Tainted: G         C         6.6.51-genipi-v8+ #1
[ 2028.756329] Hardware name: Raspberry Pi Compute Module 4 Rev 1.1 (DT)
[ 2028.762763] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2028.769721] pc : timecounter_read+0x20/0x80
[ 2028.773905] lr : mcp251xfd_ring_init+0x1a0/0x500 [mcp251xfd]
[ 2028.779575] sp : ffffffc08195b420
[ 2028.782881] x29: ffffffc08195b420 x28: ffffff8064e6a000 x27: 0000000000000001
[ 2028.790018] x26: 0000000000000000 x25: ffffff8064e6a040 x24: ffffff80420d2940
[ 2028.797154] x23: 0000000000000000 x22: 0000000000000430 x21: 0000000000000001
[ 2028.804289] x20: ffffff80420d0940 x19: ffffff80420d38b0 x18: ffffffc081453d78
[ 2028.811425] x17: 0000000000000000 x16: ffffffda4b153b58 x15: 0000007f60020fff
[ 2028.818560] x14: 0000000000000001 x13: 0000000000000000 x12: 0000000000000000
[ 2028.825694] x11: 000000000000010a x10: 0000000000000001 x9 : ffffffd9d002ea30
[ 2028.832829] x8 : 0000000000000001 x7 : 0000000000000000 x6 : 000000000000000a
[ 2028.839963] x5 : 0000000000000001 x4 : ffffffd9d00329c8 x3 : 0000000000000000
[ 2028.847098] x2 : 000000000000005c x1 : 0000000000002f70 x0 : 0000000000000000
[ 2028.854233] Call trace:
[ 2028.856673]  timecounter_read+0x20/0x80
[ 2028.860506]  mcp251xfd_ring_init+0x1a0/0x500 [mcp251xfd]
[ 2028.865823]  mcp251xfd_chip_start+0x234/0x2a0 [mcp251xfd]
[ 2028.871224]  mcp251xfd_open+0x94/0x2a8 [mcp251xfd]
[ 2028.876016]  __dev_open+0x120/0x218
[ 2028.879502]  __dev_change_flags+0x194/0x218
[ 2028.883680]  dev_change_flags+0x2c/0x80
[ 2028.887511]  do_setlink+0x28c/0xef8
...
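
The fault fields printed in the oops above can be cross-checked by hand. The following is only a minimal sketch, not part of the original report: it splits the logged ESR value into the fields the kernel prints. `decode_esr` is a hypothetical helper name, and the bit positions are assumed from the Arm ESR_ELx layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0).

```python
# Sketch: split an AArch64 ESR_ELx value into the fields shown in the oops.
# ASSUMPTION: field positions follow the Arm ESR_ELx layout; the function
# name decode_esr is made up for this illustration.

def decode_esr(esr: int) -> dict:
    iss = esr & 0x1FFFFFF              # ISS: bits 24:0
    return {
        "EC": (esr >> 26) & 0x3F,      # exception class, bits 31:26
        "IL": (esr >> 25) & 0x1,       # 1 = 32-bit instruction
        "ISS": iss,
        "FSC": iss & 0x3F,             # fault status code (data aborts)
        "WnR": (iss >> 6) & 0x1,       # 0 = read access, 1 = write access
    }


fields = decode_esr(0x96000005)        # value taken from the log above
print({name: hex(value) for name, value in fields.items()})
```

For the logged value this gives EC = 0x25, FSC = 0x05 and WnR = 0, matching the "Mem abort info" lines. WnR = 0 means the faulting access was a read, which fits the reported NULL pointer dereference at address 0.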

As far as I understand, this NULL pointer dereference occurs because these two commits are not included:
51b2a7216122 ("can: mcp251xfd: properly indent labels")
a7801540f325 ("can: mcp251xfd: move mcp251xfd_timestamp_start()/stop() into mcp251xfd_chip_start/stop()")

I tried to apply them, but a7801540f325 does not apply cleanly.

I think I will run the tests you suggested on 6.6.47 and get back to you.

Thanks, Sven

________________________________________
From: Marc Kleine-Budde
Sent: Wednesday, 25 September 2024 16:33
To: Sven Schuchmann
Cc: linux-can@xxxxxxxxxxxxxxx
Subject: Re: IRQ handler mcp251xfd_handle_tefif() returned -22

On 25.09.2024 07:38:12, Sven Schuchmann wrote:
> I am using Kernel 6.6.47 and sometimes I see this in kernel logs:
>
> [  355.728634] mcp251xfd spi0.0 canfd0: IRQ handler mcp251xfd_handle_tefif() returned -22.
> [  355.728672] mcp251xfd spi0.0 canfd0: IRQ handler returned -22 (intf=0xbf1a0016).
>
> After that the complete CAN is down.

Yes, the interface is shut down intentionally in case of errors.
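
For reading such log lines, a small helper can turn the return code and the intf value into names. This is a rough sketch and not part of the driver: `decode_irq_line` and `INT_FLAGS` are hypothetical names, and the flag bit positions are an assumption recalled from the MCP251XFD_REG_INT_*IF definitions in the driver's mcp251xfd.h, so they should be checked against the kernel tree in use.

```python
import errno
import re

# Interrupt flag names for the low 16 bits of the INT register.
# ASSUMPTION: bit positions as recalled from mcp251xfd.h
# (MCP251XFD_REG_INT_*IF); verify against your kernel sources.
INT_FLAGS = {
    0: "TXIF", 1: "RXIF", 2: "TBCIF", 3: "MODIF", 4: "TEFIF",
    8: "ECCIF", 9: "SPICRCIF", 10: "TXATIF", 11: "RXOVIF",
    12: "SERRIF", 13: "CERRIF", 14: "WAKIF", 15: "IVMIF",
}

# Matches the second log line, which carries both the code and intf.
LINE_RE = re.compile(r"IRQ handler returned (-?\d+) \(intf=(0x[0-9a-fA-F]+)\)")


def decode_irq_line(line):
    """Return (errno name, list of pending flag names) or None if no match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    ret = int(match.group(1))
    intf = int(match.group(2), 16)
    pending = [name for bit, name in sorted(INT_FLAGS.items())
               if intf & (1 << bit)]
    return errno.errorcode.get(-ret, "unknown"), pending


line = ("[  355.728672] mcp251xfd spi0.0 canfd0: "
        "IRQ handler returned -22 (intf=0xbf1a0016).")
print(decode_irq_line(line))
```

Under the assumed bit layout, the quoted line decodes to EINVAL with RXIF, TBCIF and TEFIF pending, which is consistent with the TEF handler being the one that failed.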



> ifconfig canfd0 down and up fixes the problem.

That's intentional, too :)

> We are using two CANs (both mcp251xfd) at the same time in canfd mode.
> We are sending about 9 Frames each 10ms on  both CANs (bus load of about 35% per CAN).
>
> Top shows about 10% of CPU Load on the SPIs:
>
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>    5620 root     -51   0       0      0      0 S  11.9   0.0   0:45.33 irq/45-spi0.0
>
> Anyone an idea on this?

Can you add "dev_err(&spi->dev, ... );" and print interesting things in
mcp251xfd-regmap.c where it returns -EINVAL? Maybe also add a
"dump_stack();".

Have you enabled CONFIG_CAN_MCP251XFD_SANITY? If not, please do.
Please also add "#define DEBUG" in mcp251xfd-tef.c before all "#includes".

regards,
Marc

--
Pengutronix e.K.                 | Marc Kleine-Budde          |
Embedded Linux                   | https://www.pengutronix.de |
Vertretung Nürnberg              | Phone: +49-5121-206917-129 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-9   |