> On 14.01.2022 13:43:42, Markus Mirevik wrote: > > > > I have performance problems with mcp2518fd. I have custom board > > > > based of beagleboard black. (Sitara am335x) using two mcp2518fd. > > > > The can communication uses a lot of CPU. irq/64-spi1.1 uses around > > > > ~20% at > > > > 1000 can messages/s. > > > > > > > > I have several questions but I'll start with the most confusing. I > > > > have found that every can messages fires two interrupts on my board. > > > > > > Do you mean every RX'ed CAN frame? > > Yes, every received CAN frame. > > > > > Which interrupts does increase twice? > > > > This one: > > 64: 4 44e07000.gpio 22 Level spi1.1 > > Ok, that's the IRQ line from the mcp251xfd. ACK. > > > > > I have tested this in several ways. If I send one normal can > > > > message the counter in /proc/interrupts is increased twice. I have > > > > also put some printouts in mcp251xfd-core.c that confirms that > > > > mcp251xfd_irq() is run twice and the second time intf_pending is 0. > > > > > > You mean mcp251xfd_irq() is started twice? > > > > Yes. > > That's not the way it's supposed to be. Maybe it's a limitation of the am335x > IRQ controller. What kind of limitation could that be? > > > > The big loop > > > (https://elixir.bootlin.com/linux/v5.16/source/drivers/net/can/spi/m > > > cp251xf > > > d/mcp251xfd-core.c#L2182) > > > in mcp251xfd_irq() is run twice, and the IRQ handler is left only if > > > there are no pending IRQs. > > > > > > The main IRQ handler loop (omitting the rx-int) is straight forward, > > > and not > > > optimized: > > > - read pending IRQs [*] > > > - exit if there are no pending IRQs > > > - handle pending IRQs > > > for RX: > > > - read RX-FIFO level [*] > > > - read RX'ed CAN frames [*] > > > - mark RX'ed CAN frames as read [*] > > > - loop > > > > > > All actions marked with a [*] require a SPI transfer, which results > > > in 5 SPI transfers (read pending IRQs is done twice) per RX'ed CAN frame. > > > (Assuming there is only 1 RX'ed frame pending and no pending IRQs > > > after the first loop.) > > > > > > I have some ideas how to optimize this: > > > - do a single SPI transfer instead of several small ones: > > > e.g. pending IRQs + RX-FIFO level, or read RX'ed frames + mark as > > > read > > > - opportunistically assume RX'ed frame on interrupt and get rid of 1st > > > read pending IRQs > > > - assume TX done IRQ, if CAN frames have been TX'ed but not marked as > > > sent hardware > > > - coalesce RX IRQs > > > - coalesce TX done IRQs > > > - combine several TX frames into single SPI transfer > > > > > > > Indeed there is a lot to work on, I'm a bit out of my depth here and > > as I said the first question is about why it interrupt is fired twice. > > > > It's not only the big loop that is run twice. With rx-int active what > > I can observe is this: > > > > - read RX-FIFO level [*] > > - read RX'ed CAN frames [*] > > - mark RX'ed CAN frames as read [*] > > - read pending IRQs [*] > > - exit if there are no pending IRQs > > -> No Pending IRQ's Exiting. > > So far that look correct, but the extra call to mcp251xfd_irq() that follow is > not correct. > > > - read pending IRQs [*] > > - exit if there are no pending IRQs > > -> No Pending IRQ's Exiting. > > > > > And for testing purposes I changed the interrupt config to trigger > > > > on falling edge instead of level and with this configured there is > > > > only one interrupt fired. But I guess this will risk locking the > > > > interrupt line low if an edge isn't detected. > > > > > > ACK - Please don't use edge triggered IRQs, sooner or later the > > > driver will miss an IRQ resulting in a handing driver. Always use > > > level triggered with the mcp251xfd. > > > > > > > Yes, I figured that. > > > > > > I have tried both with rx-int active and inactive. My scope shows > > > > really nice signals and that rx-int and int is deactivated > > > > simultaneous on incoming frames. > > > > > > rx-int is an optimization to skip the first get IRQ pending > > > transfer, as it indicates RX'ed CAN frames. > > > > OFFTOPIC: > > Shouldn’t a return after the rx-int part be ok to skip the next get IRQ > pending transfer as well? > > That's a possible optimization. > > I would look at the number of pending TX CAN frames. The higher the > number, the higher the chance that some have been send. > > > if (priv->rx_int) { > > do { > > int rx_pending; > > rx_pending = gpiod_get_value_cansleep(priv->rx_int); > > > > if (!rx_pending) > > break; > > > > err = mcp251xfd_handle(priv, rxif); > > if (err) > > goto out_fail; > > > > handled = IRQ_HANDLED; > > } while (1); > > > > if (handled == IRQ_HANDLED) { > > return handled; > > } > > } > > ... > > > > Even better would be to check if (IRQ.VALUE == INACTIVE) but I don > > know how to that.. > > I think this is not possible in the Linux kernel anymore. There was a function > to get the GPIO associated with an IRQ, but that's not available anymore. In > general IRQs are not associated to GPIO, that's probably only the case for > SoCs, but not for other busses like PCI. > OK. > [...] > > > > Can you test again with IRQ_TYPE_LEVEL_LOW and no rx-int-gpios. > > > Please instrument the beginning and the returns of the > > > mcp251xfd_irq() function to check if it's really started twice. > > > Please print the > > > priv->regs_status.intf in every loop. Can you record the gpio2_5 > > > priv->with a > > > scope. > > > > Yes I can but what do you mean with Instrument? > > That's a fancy way of saying: please add printk() :) > > > And no scope until Monday. > > no problem > > > > Make sure that you don't send any CAN frames as reaction of reception. > > > > How do I make sure of that? :/ > > The Linux kernel doesn't send any CAN frames on its own. You need a user > space program to do so. Make sure you don't have any user space program > running that's sending CAN frames. > No user space program running that would send can frames. And no can frames received on the other side. With IRQ_TYPE_LEVEL_LOW and the rx-int-gpios disabled I have added a call to netdev_info at the top of the function: ... netdev_info(priv->ndev, "mcp251xfd_irq mcp251xfd_regs_status.intf: %d", priv->regs_status.intf ); if (priv->rx_int) { ... and here: ... if (!(intf_pending)) { netdev_info(priv->ndev, "!intf_pending mcp251xfd_regs_status.intf: %d", priv->regs_status.intf ); return handled; } netdev_info(priv->ndev, "intf_pending mcp251xfd_regs_status.intf: %d", priv->regs_status.intf ); ... Resulting in this when receiving one can message: [ 987.486691] mcp251xfd spi1.1 can1: mcp251xfd_irq mcp251xfd_regs_status.intf: -1088815100 [ 987.494987] mcp251xfd spi1.1 can1: intf_pending mcp251xfd_regs_status.intf: -1088815098 [ 987.503239] mcp251xfd spi1.1 can1: !intf_pending mcp251xfd_regs_status.intf: -1088815100 [ 987.511423] mcp251xfd spi1.1 can1: mcp251xfd_irq mcp251xfd_regs_status.intf: -1088815100 [ 987.519603] mcp251xfd spi1.1 can1: !intf_pending mcp251xfd_regs_status.intf: -1088815100 mcp251xfd_regs_status.intf translated to binary 1 1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0100 2 1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0110 3 1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0100 4 1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0100 5 1111 1111 1111 1111 1111 1111 1111 1111 1011 1111 0001 1010 0000 0000 0000 0100 I have also attached a scope picture, this is without the printouts. And if I'm reading it correctly it confirms that it runs twice. 1 read pending IRQs [*] -RXIF- 2 read RX-FIFO level [*] 3 read RX'ed CAN frames [*] 4 and 5 mark RX'ed CAN frames as read [*] (This is the 5.10.79 version that uses regmap_update_bits which I assume uses one read and one write instruction.) -loop- 6 read pending IRQs [*] -Exit- 7 read pending IRQs [*] Confirmd with printouts: [ 26.990324] mcp251xfd spi1.1 can1: mcp251xfd_irq() [ 26.995173] mcp251xfd spi1.1 can1: * Bulk read MCP251XFD_REG_INT [ 27.001371] mcp251xfd spi1.1 can1: Interrupts pending, handling. [ 27.007435] mcp251xfd spi1.1 can1: * Read RX MCP251XFD_REG_FIFOSTA [ 27.013676] mcp251xfd spi1.1 can1: * Bulk read. Register: 1248 count: 5 [ 27.020226] mcp251xfd spi1.1 can1: * Update bits MCP251XFD_REG_FIFOCON_UINC [ 27.027318] mcp251xfd spi1.1 can1: * Bulk read MCP251XFD_REG_INT [ 27.033378] mcp251xfd spi1.1 can1: No interrupts pending, returning. [ 27.039805] mcp251xfd spi1.1 can1: mcp251xfd_irq() [ 27.044617] mcp251xfd spi1.1 can1: * Bulk read MCP251XFD_REG_INT [ 27.050693] mcp251xfd spi1.1 can1: No interrupts pending, returning. Regards Markus
Attachment:
scope.png
Description: scope.png