On 26.09.2024 18:52:19, Sven Schuchmann wrote:
> my observations so far:
> The EINVAL is returned by
>     err = mcp251xfd_tef_obj_read(priv, hw_tef_obj, tef_tail, l);
> inside mcp251xfd_handle_tefif()
>
> I modified mcp251xfd_tef_obj_read() like so:
>
>     err = regmap_bulk_read(priv->map_rx,
>                            mcp251xfd_get_tef_obj_addr(offset),
>                            hw_tef_obj,
>                            sizeof(*hw_tef_obj) / val_bytes * len);
>     if (err) {
>         dump_stack();

As you already have located the call with "len = 0", we don't need that
dump_stack() anymore.

>         netdev_err(priv->ndev,
>                    "Offset=%d, sizeof(*hw_tef_obj)=%ld, val_bytes=%d, len=%d).\n", offset, sizeof(*hw_tef_obj), val_bytes, len);
>     }
>
> and now I get:
> [ 45.672211] CPU: 0 PID: 1643 Comm: irq/47-spi1.0 Tainted: G C 6.6.51-pi-v8+ #3
> [ 45.672240] Hardware name: Raspberry Pi Compute Module 4 Rev 1.1 (DT)
> [ 45.672247] Call trace:
> [ 45.672254] dump_backtrace+0xa0/0x100
> [ 45.672274] show_stack+0x20/0x38
> [ 45.672284] dump_stack_lvl+0x48/0x60
> [ 45.672300] dump_stack+0x18/0x28
> [ 45.672313] mcp251xfd_handle_tefif+0x360/0x538 [mcp251xfd]
> [ 45.672349] mcp251xfd_irq+0x410/0xda0 [mcp251xfd]
> [ 45.672373] irq_thread_fn+0x34/0xb8
> [ 45.672382] irq_thread+0x174/0x260
> [ 45.672393] kthread+0x11c/0x128
> [ 45.672407] ret_from_fork+0x10/0x20
> [ 45.672426] mcp251xfd spi1.0 canfd1: Offset=3, sizeof(*hw_tef_obj)=12, val_bytes=4, len=0).
> [ 45.672450] mcp251xfd spi1.0 canfd1: IRQ handler mcp251xfd_handle_tefif() returned -22.
> [ 45.672459] mcp251xfd spi1.0 canfd1: IRQ handler returned -22 (intf=0xbf1a0010).
>
> len=0 looks strange to me here.

Yes, the regmap_*() functions return an error if called with "len = 0".
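
To make the arithmetic from the log explicit, here is a small userspace
sketch (not driver code). mock_bulk_read() and tef_val_count() are
hypothetical helpers made up for this illustration; the mock only
mirrors the behaviour described above, i.e. it rejects a zero value
count. The numbers 12, 4 and 0 are taken from the netdev_err() output.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a regmap bulk read. It is NOT the kernel
 * function; it only models the behaviour described above: a zero
 * value count is rejected with -EINVAL.
 */
static int mock_bulk_read(unsigned int reg, void *val, size_t val_count)
{
	(void)reg;
	(void)val;

	if (val_count == 0)
		return -EINVAL;

	return 0;
}

/* Value count as computed in the quoted snippet:
 * sizeof(*hw_tef_obj) / val_bytes * len
 */
static size_t tef_val_count(size_t obj_size, size_t val_bytes, size_t len)
{
	return obj_size / val_bytes * len;
}
```

With the logged values, tef_val_count(12, 4, 0) is 12 / 4 * 0 = 0, so
the mock returns -EINVAL, which is -22 on Linux and matches the
"returned -22" lines in the log.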

> This zero len is coming from inside mcp251xfd_handle_tefif()
>     err = mcp251xfd_get_tef_len(priv, &len);
>
> I also modified this one:
>     len = (chip_tx_tail << shift) - (tail << shift);
>     *len_p = len >> shift;
>
>     if (*len_p == 0) {
>         netdev_err(priv->ndev, "len=%d, chip_tx_tail=%d, tail=%d, shift=%d\n", len, chip_tx_tail, tail, shift);
>     }
>
> and I get this:
> [ 54.645392] mcp251xfd spi1.0 canfd1: len=0, chip_tx_tail=1, tail=1, shift=6
>
> But I am not sure if the len=0 is really the problem..?

Yes, "len = 0" is the problem here. So far the driver assumes that if
there is a TEF interrupt there must be events in the TE-FIFO, i.e. the
length must be != 0.

> > I've already send a mail to stable to include these in the next
> > stable release.
>
> Perfect!

Meanwhile the patches are queued for the next stable releases:

| https://lore.kernel.org/all/2024092734-tackle-outlying-ae73@gregkh

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde          |
Embedded Linux                   | https://www.pengutronix.de |
Vertretung Nürnberg              | Phone: +49-5121-206917-129 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-9   |