Hi > On 14.01.2022 11:05:51, Markus Mirevik wrote: > > I have performance problems with mcp2518fd. I have custom board based > > of beagleboard black. (Sitara am335x) using two mcp2518fd. The can > > communication uses a lot of CPU. irq/64-spi1.1 uses around ~20% at > > 1000 can messages/s. > > > > I have several questions but I'll start with the most confusing. I > > have found that every can messages fires two interrupts on my board. > > Do you mean every RX'ed CAN frame? Yes, every received CAN frame. > Which interrupts does increase twice? This one: 64: 4 44e07000.gpio 22 Level spi1.1 > > I have tested this in several ways. If I send one normal can message > > the counter in /proc/interrupts is increased twice. I have also put > > some printouts in mcp251xfd-core.c that confirms that mcp251xfd_irq() > > is run twice and the second time intf_pending is 0. > > You mean mcp251xfd_irq() is started twice? Yes. > > The big loop > (https://elixir.bootlin.com/linux/v5.16/source/drivers/net/can/spi/mcp251xf > d/mcp251xfd-core.c#L2182) > in mcp251xfd_irq() is run twice, and the IRQ handler is left only if there are > no pending IRQs. > > The main IRQ handler loop (omitting the rx-int) is straight forward, and not > optimized: > - read pending IRQs [*] > - exit if there are no pending IRQs > - handle pending IRQs > for RX: > - read RX-FIFO level [*] > - read RX'ed CAN frames [*] > - mark RX'ed CAN frames as read [*] > - loop > > All actions marked with a [*] require a SPI transfer, which results in 5 SPI > transfers (read pending IRQs is done twice) per RX'ed CAN frame. > (Assuming there is only 1 RX'ed frame pending and no pending IRQs after > the first loop.) > > I have some ideas how to optimize this: > - do a single SPI transfer instead of several small ones: > e.g. pending IRQs + RX-FIFO level, or read RX'ed frames + mark as read > - opportunistically assume RX'ed frame on interrupt and get rid of 1st > read pending IRQs > - assume TX done IRQ, if CAN frames have been TX'ed but not marked as > sent hardware > - coalesce RX IRQs > - coalesce TX done IRQs > - combine several TX frames into single SPI transfer > Indeed there is a lot to work on, I'm a bit out of my depth here and as I said the first question is about why it interrupt is fired twice. It's not only the big loop that is run twice. With rx-int active what I can observe is this: - read RX-FIFO level [*] - read RX'ed CAN frames [*] - mark RX'ed CAN frames as read [*] - read pending IRQs [*] - exit if there are no pending IRQs -> No Pending IRQ's Exiting. - read pending IRQs [*] - exit if there are no pending IRQs -> No Pending IRQ's Exiting. > > And for testing purposes I changed the interrupt config to trigger on > > falling edge instead of level and with this configured there is only > > one interrupt fired. But I guess this will risk locking the interrupt > > line low if an edge isn't detected. > > ACK - Please don't use edge triggered IRQs, sooner or later the driver will > miss an IRQ resulting in a handing driver. Always use level triggered with the > mcp251xfd. > Yes, I figured that. > > I have tried both with rx-int active and inactive. My scope shows > > really nice signals and that rx-int and int is deactivated > > simultaneous on incoming frames. > > rx-int is an optimization to skip the first get IRQ pending transfer, as it > indicates RX'ed CAN frames. OFFTOPIC: Shouldn’t a return after the rx-int part be ok to skip the next get IRQ pending transfer as well? if (priv->rx_int) { do { int rx_pending; rx_pending = gpiod_get_value_cansleep(priv->rx_int); if (!rx_pending) break; err = mcp251xfd_handle(priv, rxif); if (err) goto out_fail; handled = IRQ_HANDLED; } while (1); if (handled == IRQ_HANDLED) { return handled; } } ... Even better would be to check if (IRQ.VALUE == INACTIVE) but I don know how to that.. > > > I'm testing by sending messages from my computer through a can dongle. > > > > The board is currently running Linux 5.10.79-yocto-tiny #1 SMP Tue Nov > > 16 03:57:43 UTC 2021 armv7l armv7l armv7l GNU/Linux > > Newer kernels include some optimizations.... > > > I've also tested updating the driver to the one from > > https://github.com/marckleinebudde/linux.git from 2021-12-29 with the > > same result (IRQ handler run twice). > > The newest production ready code is always net-next/master. But you can > use the latest official kernel release (v5.16) too. > > > I am profoundly confused by this issue and I can not figure out why > > it's happening. My understanding is that since the IRQ handler is > > loaded with IRQF_ONESHOT the irq line should be masked until > > mcp251xfd_irq() returns. > > ACK. > > > Since mcp251xfd_irq() checks that !rx_pending before exiting the int > > signal must be high. And the interrupt should not fire again... > > The rx-int GPIO: > > | microchip,rx-int-gpios = <&gpio1 8 GPIO_ACTIVE_LOW>; > > is not used as an interrupt. The only interrupt line is: > > | interrupts-extended = <&gpio2 5 IRQ_TYPE_EDGE_FALLING>; > > If there is an interrupt the gpio2_5 will get active, if this is an RX IRQ the > gpio1_8 will get active, too. But only gpio2_5 will trigger the interrupt > handler. > > In the interrupt handler, as rx-int is available and active the RX handling is > done first. If there are no more RX IRQs pending, the rx-int goes inactive and > the big loop in the IRQ handler will be handled: read rx-pending IRQs, > probably none pending, then leave IRQ handler. > Ok. > > Result from init: > > [ 4.003029] mcp251xfd spi1.0 can0: MCP2518FD rev0.0 (+RX_INT - > MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz > m:20.00MHz r:17.00MHz e:0.00MHz) successfully initialized. > > [ 4.028220] mcp251xfd spi1.1 can1: MCP2518FD rev0.0 (-RX_INT - > MAB_NO_WARN +CRC_REG +CRC_RX +CRC_TX +ECC -HD c:40.00MHz > m:20.00MHz r:17.00MHz e:0.00MHz) successfully initialized. > > > > This is the current dtsi part for the canFD chips. (With rx-pin removed on > one of the devices and trigger on edge on the other for debugging > purposes). > > > > #include <dt-bindings/gpio/gpio.h> > > #include <dt-bindings/interrupt-controller/irq.h> > > > > &am33xx_pinmux{ > > pinctrl_spi1_pins: pinctrl_spi1_pins { > > pinctrl-single,pins = < > > AM33XX_IOPAD(0x990, PIN_INPUT | MUX_MODE3) /* (A13) > mcasp0_aclkx.spi1_sclk */ > > AM33XX_IOPAD(0x994, PIN_INPUT | MUX_MODE3) /* (B13) > mcasp0_fsx.spi1_d0 */ > > AM33XX_IOPAD(0x998, PIN_INPUT | MUX_MODE3) /* (D12) > mcasp0_axr0.spi1_d1 */ > > AM33XX_IOPAD(0x96c, PIN_OUTPUT_PULLUP | MUX_MODE5) > /* (E17) uart0_rtsn.spi1_cs0 CleANopen LEFT*/ > > AM33XX_IOPAD(0x9b0, PIN_OUTPUT_PULLUP | > MUX_MODE4) /* (A15) xdma_event_intr0.spi1_cs1 SAWM CAN > RIGHT*/ > > >; > > }; > > > > can0_int_pins: can0_int_pins { > > pinctrl-single,pins = < > > /*CleANopen*/ > > AM33XX_IOPAD(0x89c, PIN_INPUT_PULLUP | MUX_MODE7) /* > (T6) gpmc_be0n_cle.gpio2[5] nINT */ > > AM33XX_IOPAD(0x968, PIN_INPUT_PULLUP | MUX_MODE7) /* > (E18) uart0_ctsn.gpio1[8] nINT1 */ > > >; > > }; > > > > can1_int_pins: can1_int_pins { > > pinctrl-single,pins = < > > /*SAWM CAN*/ > > AM33XX_IOPAD(0x820, PIN_INPUT_PULLUP | MUX_MODE7) /* > (U10) gpmc_ad8.gpio0[22] nINT */ > > AM33XX_IOPAD(0x8c8, PIN_INPUT_PULLUP | MUX_MODE7) /* > (U3) lcd_data10.gpio2[16] nINT1 */ > > AM33XX_IOPAD(0x8cc, PIN_INPUT_PULLUP | MUX_MODE7) /* > (U4) lcd_data11.gpio2[17] nINT0 NOT USED */ > > >; > > }; > > }; > > > > > > > > /{ > > /* external 40M oscillator of mcp2518fd on SPI1.0 */ > > mcp2518fd_can0_osc: mcp2518fd_can0_osc { > > compatible = "fixed-clock"; > > #clock-cells = <0>; > > clock-frequency = <40000000>; > > }; > > }; > > > > > > /{ > > /* external 40M oscillator of mcp2518fd on SPI1.1 */ > > mcp2518fd_can1_osc: mcp2518fd_can1_osc { > > compatible = "fixed-clock"; > > #clock-cells = <0>; > > clock-frequency = <40000000>; > > }; > > }; > > > > /* the spi config of the can-controller itself binding everything > > together */ &spi1{ > > #address-cells = <1>; > > #size-cells = <0>; > > > > status = "okay"; > > pinctrl-names = "default"; > > pinctrl-0 = <&pinctrl_spi1_pins>; > > > > /*CleANopen*/ > > can@0 { > > compatible = "microchip,mcp2518fd"; > > reg = <0>; > > clocks = <&mcp2518fd_can0_osc>; > > pinctrl-names = "default"; > > pinctrl-0 = <&can0_int_pins>; > > spi-max-frequency = <20000000>; > > interrupts-extended = <&gpio2 5 IRQ_TYPE_EDGE_FALLING>; > > See comment about edge IRQs above! ACK. > > > microchip,rx-int-gpios = <&gpio1 8 GPIO_ACTIVE_LOW>; > > }; > > > > can@1 { > > compatible = "microchip,mcp2518fd"; > > reg = <1>; > > clocks = <&mcp2518fd_can1_osc>; > > pinctrl-names = "default"; > > pinctrl-0 = <&can1_int_pins>; > > spi-max-frequency = <20000000>; > > interrupts-extended = <&gpio0 22 IRQ_TYPE_LEVEL_LOW>; > > // microchip,rx-int-gpios = <&gpio2 16 GPIO_ACTIVE_LOW>; > > }; > > }; > > Can you test again with IRQ_TYPE_LEVEL_LOW and no rx-int-gpios. Please > instrument the beginning and the returns of the mcp251xfd_irq() function to > check if it's really started twice. Please print the > priv->regs_status.intf in every loop. Can you record the gpio2_5 with a > scope. Yes I can but what do you mean with Instrument? And no scope until Monday. > Make sure that you don't send any CAN frames as reaction of reception. How do I make sure of that? :/ > > regards, > Marc > > -- > Pengutronix e.K. | Marc Kleine-Budde | > Embedded Linux | https://www.pengutronix.de | > Vertretung West/Dortmund | Phone: +49-231-2826-924 | > Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 | Thank you for your time Markus