Hi Marc,

Thanks for the detailed answer!

On Tue. 4 May 2021 at 16:48, Marc Kleine-Budde <mkl@xxxxxxxxxxxxxx> wrote:
> On 04.05.2021 06:46:17, Vincent MAILHOL wrote:
> > > And even on the mcp251xfd, where I receive the CAN frame, there's no way
> > > to tell if this frame has been acked or not.
>
> The test setup is:
>
>                 flexcan (listen only)
>                           |
>                           |
> PEAK PCAN-USB FD ---------+--------- mcp2518fd (listen only)
>     (sender)              |
>                           |
>          candlelight (going to be unplugged)
>
> pcan-usb: sending CAN frames
> flexcan: receiving CAN frames - but controller in listen only mode
> mcp2518fd: receiving CAN frames - but controller in listen only mode
> candlelight: receiving CAN frames - first attached, then detached
>
> > The mcp251xfd behavior is interesting. Do you also receive the ACK
> > error flag?
>
> In my tests from yesterday neither the flexcan nor the mcp2518fd had bus
> error reporting enabled. So I haven't noticed any ACK errors on the
> mcp2518fd nor the flexcan.
>
> I just repeated the test with bus error reporting enabled:
>
> On the flexcan I receive _only_ these errors (repeating) with
> candlelight detached:
>
> | (2021-05-04 09:00:30.407709) can0 RX - - 20000088 [8] 00 00 08 00 00 00 00 00 ERRORFRAME
> | protocol-violation{{tx-dominant-bit-error}{}}
> | bus-error
>
>
> On the mcp2518fd I see these errors:
>
> | (2021-05-04 09:05:00.594321) mcp251xfd0 RX - - 222 [8] 4A 00 00 00 00 00 00 00
> | (2021-05-04 09:05:01.094418) mcp251xfd0 RX - - 222 [8] 4B 00 00 00 00 00 00 00
> | (2021-05-04 09:05:01.594577) mcp251xfd0 RX - - 222 [8] 4C 00 00 00 00 00 00 00
> ...unplug candlelight here...
> | (2021-05-04 09:05:02.094878) mcp251xfd0 RX - - 20000088 [8] 00 00 02 00 00 00 00 00 ERRORFRAME
> | protocol-violation{{frame-format-error}{}}
> | bus-error
> | (2021-05-04 09:05:02.095589) mcp251xfd0 RX - - 20000088 [8] 00 00 02 00 00 00 00 00 ERRORFRAME
> | protocol-violation{{frame-format-error}{}}
> | bus-error
> | (2021-05-04 09:05:02.096263) mcp251xfd0 RX - - 20000088 [8] 00 00 02 00 00 00 00 00 ERRORFRAME
> | protocol-violation{{frame-format-error}{}}
> | bus-error
> | (2021-05-04 09:05:02.096934) mcp251xfd0 RX - - 20000088 [8] 00 00 02 00 00 00 00 00 ERRORFRAME
> | protocol-violation{{frame-format-error}{}}
> | bus-error
> | (2021-05-04 09:05:02.097596) mcp251xfd0 RX - - 20000088 [8] 00 00 02 00 00 00 00 00 ERRORFRAME
> | protocol-violation{{frame-format-error}{}}
> | bus-error
> | (2021-05-04 09:05:02.098261) mcp251xfd0 RX - - 20000088 [8] 00 00 02 00 00 00 00 00 ERRORFRAME
> | protocol-violation{{frame-format-error}{}}
> | bus-error
> | (2021-05-04 09:05:02.099035) mcp251xfd0 RX - - 222 [8] 4D 00 00 00 00 00 00 00
> | (2021-05-04 09:05:02.099054) mcp251xfd0 RX - - 222 [8] 4D 00 00 00 00 00 00 00
> | (2021-05-04 09:05:02.099603) mcp251xfd0 RX - - 20000088 [8] 00 00 00 00 00 00 00 00 ERRORFRAME
> | protocol-violation{{}{}}
> | bus-error
>
> from here now only RX frames, no error frames

I guess that the above error flags are the consequence of interference
on the bus while unplugging the candlelight. Those are probably not
relevant to our specific topic.

> | (2021-05-04 09:05:02.100540) mcp251xfd0 RX - - 222 [8] 4D 00 00 00 00 00 00 00
> | (2021-05-04 09:05:02.100570) mcp251xfd0 RX - - 222 [8] 4D 00 00 00 00 00 00 00
> | (2021-05-04 09:05:02.100583) mcp251xfd0 RX - - 222 [8] 4D 00 00 00 00 00 00 00
> | (2021-05-04 09:05:02.100593) mcp251xfd0 RX - - 222 [8] 4D 00 00 00 00 00 00 00
> | (2021-05-04 09:05:02.101326) mcp251xfd0 RX - - 222 [8] 4D 00 00 00 00 00 00 00
>
> ... and repeating.
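
For reference, the raw error frames quoted above can be decoded by hand
to check which error classes they carry. The snippet below is only a
rough sketch: the bit values are copied by hand from my reading of
include/uapi/linux/can/error.h, the label strings are chosen to resemble
the candump output, and decode_error_frame() is a hypothetical helper
name, not an existing API.

```python
# Sketch only: constants hand-copied from include/uapi/linux/can/error.h
# (assumption, please double check against your kernel headers).
CAN_ERR_FLAG = 0x20000000  # marks a SocketCAN error message frame

# Error class bits carried in the CAN ID of an error frame.
ERR_CLASSES = {
    0x001: "tx-timeout",
    0x002: "lost-arbitration",
    0x004: "controller-problem",
    0x008: "protocol-violation",
    0x010: "transceiver-status",
    0x020: "no-acknowledgement-on-tx",
    0x040: "bus-off",
    0x080: "bus-error",
    0x100: "restarted-after-bus-off",
}

# Protocol violation type bits carried in data[2].
PROT_TYPES = {
    0x01: "single-bit-error",
    0x02: "frame-format-error",
    0x04: "bit-stuffing-error",
    0x08: "tx-dominant-bit-error",
    0x10: "tx-recessive-bit-error",
    0x20: "bus-overload",
    0x40: "active-error",
    0x80: "error-on-tx",
}


def decode_error_frame(can_id, data):
    """Return (error classes, protocol violation types) of an error frame."""
    if not can_id & CAN_ERR_FLAG:
        raise ValueError("not an error frame")
    classes = [name for bit, name in ERR_CLASSES.items() if can_id & bit]
    prot = [name for bit, name in PROT_TYPES.items() if data[2] & bit]
    return classes, prot


# The frame reported by the flexcan in the log above.
classes, prot = decode_error_frame(0x20000088, bytes.fromhex("0000080000000000"))
print(classes, prot)
```

If these constants are right, none of the error frames in the logs has
the "no-acknowledgement-on-tx" class bit (0x20) set in its CAN ID.
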
>
>
> Here a short dump of the mcp2518fd registers:
>
> | INT: intf(0x01c)=0xbf1a0806
> |         IE  IF  IE & IF
> | IVMI    x                Invalid Message Interrupt
> | WAKI                     Bus Wake Up Interrupt
> | CERRI   x                CAN Bus Error Interrupt
> | SERRI   x                System Error Interrupt
> | RXOVI   x   x   x        Receive FIFO Overflow Interrupt
> | TXATI   x                Transmit Attempt Interrupt
> | SPICRCI x                SPI CRC Error Interrupt
> | ECCI    x                ECC Error Interrupt
> | TEFI    x                Transmit Event FIFO Interrupt
> | MODI    x                Mode Change Interrupt
> | TBCI        x            Time Base Counter Interrupt
> | RXI     x   x   x        Receive FIFO Interrupt
> | TXI                      Transmit FIFO Interrupt
>
> Note: there is no invalid message interrupt pending
>
> | TREC: trec(0x034)=0x00000000
> | TXBO     Transmitter in Bus Off State
> | TXBP     Transmitter in Error Passive State
> | RXBP     Receiver in Error Passive State
> | TXWARN   Transmitter in Error Warning State
> | RXWARN   Receiver in Error Warning State
> | EWARN    Transmitter or Receiver is in Error Warning State
> | TEC = 0  Transmit Error Counter
> | REC = 0  Receive Error Counter
> |
> | BDIAG0: bdiag0(0x038)=0x00000010
> | DTERRCNT = 0   Data Bit Rate Transmit Error Counter
> | DRERRCNT = 0   Data Bit Rate Receive Error Counter
> | NTERRCNT = 0   Nominal Bit Rate Transmit Error Counter
> | NRERRCNT = 16  Nominal Bit Rate Receive Error Counter
> |
> | BDIAG1: bdiag1(0x03c)=0x0000dd4b
> | DLCMM     DLC Mismatch
> | ESI       ESI flag of a received CAN FD message was set
> | DCRCERR   Data CRC Error
> | DSTUFERR  Data Bit Stuffing Error
> | DFORMERR  Data Format Error
> | DBIT1ERR  Data BIT1 Error
> | DBIT0ERR  Data BIT0 Error
> | TXBOERR   Device went to bus-off (and auto-recovered)
> | NCRCERR   CRC Error
> | NSTUFERR  Bit Stuffing Error
> | NFORMERR  Format Error
> | NACKERR   Transmitted message was not acknowledged
> | NBIT1ERR  Bit1 Error
> | NBIT0ERR  Bit0 Error
> | EFMSGCNT = 56651  Error Free Message Counter
>
> > Does the controller retry to send the frame until it gets
> > acknowledged?
>
> Yes - as it should.
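
As a side note on the register dump above, the counters can be
cross-checked against the raw values. This is a minimal sketch under
assumptions: the field layout (8-bit counters packed into BDIAG0 with
NRERRCNT in the lowest byte, EFMSGCNT in the low 16 bits of BDIAG1 with
the error flags above it) is what I believe the datasheet says and is
consistent with the decoded values in the dump, but I did not re-verify
it. The function names are hypothetical.

```python
# Sketch only: the field layout below is an assumption that matches the
# decoded values in the quoted dump; check the MCP2518FD datasheet.

def decode_bdiag0(value):
    """Split BDIAG0 into its four 8-bit error counters."""
    return {
        "NRERRCNT": value & 0xFF,          # nominal bit rate receive errors
        "NTERRCNT": (value >> 8) & 0xFF,   # nominal bit rate transmit errors
        "DRERRCNT": (value >> 16) & 0xFF,  # data bit rate receive errors
        "DTERRCNT": (value >> 24) & 0xFF,  # data bit rate transmit errors
    }


def decode_bdiag1(value):
    """Split BDIAG1 into the error free message counter and raw flags."""
    return {
        "EFMSGCNT": value & 0xFFFF,               # error free message counter
        "error_flags_raw": (value >> 16) & 0xFFFF,  # non-zero if any flag is set
    }


# Raw values taken from the dump above.
print(decode_bdiag0(0x00000010))
print(decode_bdiag1(0x0000DD4B))
```

With that layout, the raw values give NRERRCNT = 16, EFMSGCNT = 56651
and no error flag set, which agrees with the dump.
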

I should have been more careful when reading your previous message. I
could have seen that you sent the message with an increasing payload
and that as soon as the acknowledging node was removed, the same
payload kept repeating again and again.

In light of the above information, I have two remarks:

First, the Peak does not generate the ACK error flag as it is expected
to. I do not know if this is a side effect of setting it to listen
only. I would expect the listen only mode to only impact the reception,
but maybe it also has the side effect of not generating an error when
the ACK bit is not received? Does the Peak correctly send the ACK error
flag when sending in normal mode (not listen only)?

Second, the receiver behavior when receiving a non-ACKed frame is
actually unspecified. As mentioned before, non-ACKed frames should be
immediately followed by an ACK error flag. Here, the receiving nodes
are facing a situation which should never occur. The mcp2518fd decides
to register the frame as received and the flexcan decides not to
register the frame. I think that both behaviors are actually fine: in
the absence of a specification, the implementation is free to decide
how to handle this corner case.

In short, the real question is the first point: why didn't the Peak
send the ACK error flag?

> > Are you still able to send frames and receive the echo if there is a
> > single node on the network?
>
> No - But the peak driver/hw has some limitations:
>
> The peak driver doesn't have TX complete signaling, it send the echo
> after sending the TX CAN frame via USB. And the peak controller seems to
> buffer quite a lot TX CAN frames, so it looks for the first ~72 frames
> like the bus is still working.

Yes, I also noticed that when I had peak devices in my test lab. The
peak driver calls can_put_echo_skb() inside peak_usb_ndo_start_xmit()
and thus, the echo frames do not reflect whether the actual completion
occurred or not.
I guess fixing that should not be too hard, but I do not have access to
that hardware anymore to do it myself.

I am just surprised by the value of 72 frames. My understanding is that
peak_usb_ndo_start_xmit() should stop the network queue whenever the
number of active tx urbs reaches 10.

Ref:
https://elixir.bootlin.com/linux/latest/source/drivers/net/can/usb/peak_usb/pcan_usb_core.c#L399
https://elixir.bootlin.com/linux/latest/source/drivers/net/can/usb/peak_usb/pcan_usb_core.h#L29


Yours sincerely,
Vincent