Attempt 2, now as plain text... (vger doesn't like html) On 10/10/19 5:52 PM, Marc Kleine-Budde wrote: > On 10/10/19 2:17 PM, Marc Kleine-Budde wrote: >> From: Jeroen Hofstee <jhofstee@xxxxxxxxxxxxxxxxx> >> >> When the rx fifo overflows the ti_hecc would silently drop them since >> the overwrite protection is enabled for all mailboxes. So disable it >> for the lowest priority mailbox and increment the rx_fifo_errors when >> receive message lost is set. Drop the message itself in that case, >> since it might be partially updated. > Is that your observation or does the data sheet say anything to this > situation? I couldn't find in the data sheet, so I simply tested it, by allowing the highest mailbox to be overwritten and send a stream alternating with messages will all bits set and all cleared. That does end with canids from one message combined with data from another. >> Signed-off-by: Jeroen Hofstee <jhofstee@xxxxxxxxxxxxxxxxx> >> Signed-off-by: Marc Kleine-Budde <mkl@xxxxxxxxxxxxxx> >> --- >> drivers/net/can/ti_hecc.c | 21 +++++++++++++++++---- >> 1 file changed, 17 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/net/can/ti_hecc.c b/drivers/net/can/ti_hecc.c >> index 6ea29126c60b..c2d83ada203a 100644 >> --- a/drivers/net/can/ti_hecc.c >> +++ b/drivers/net/can/ti_hecc.c >> @@ -82,7 +82,7 @@ MODULE_VERSION(HECC_MODULE_VERSION); >> #define HECC_CANTA 0x10 /* Transmission acknowledge */ >> #define HECC_CANAA 0x14 /* Abort acknowledge */ >> #define HECC_CANRMP 0x18 /* Receive message pending */ >> -#define HECC_CANRML 0x1C /* Remote message lost */ >> +#define HECC_CANRML 0x1C /* Receive message lost */ >> #define HECC_CANRFP 0x20 /* Remote frame pending */ >> #define HECC_CANGAM 0x24 /* SECC only:Global acceptance mask */ >> #define HECC_CANMC 0x28 /* Master control */ >> @@ -385,8 +385,17 @@ static void ti_hecc_start(struct net_device *ndev) >> /* Enable tx interrupts */ >> hecc_set_bit(priv, HECC_CANMIM, BIT(HECC_MAX_TX_MBOX) - 1); >> >> - /* Prevent message over-write & Enable interrupts */ >> - hecc_write(priv, HECC_CANOPC, HECC_SET_REG); >> + /* Prevent message over-write to create a rx fifo, but not for >> + * the lowest priority mailbox, since that allows detecting >> + * overflows instead of the hardware silently dropping the >> + * messages. The lowest rx mailbox is one above the tx ones, >> + * hence its mbxno is the number of tx mailboxes. >> + */ >> + mbxno = HECC_MAX_TX_MBOX; >> + mbx_mask = ~BIT(mbxno); >> + hecc_write(priv, HECC_CANOPC, mbx_mask); >> + >> + /* Enable interrupts */ >> if (priv->use_hecc1int) { >> hecc_write(priv, HECC_CANMIL, HECC_SET_REG); >> hecc_write(priv, HECC_CANGIM, HECC_CANGIM_DEF_MASK | >> @@ -531,6 +540,7 @@ static unsigned int ti_hecc_mailbox_read(struct can_rx_offload *offload, >> { >> struct ti_hecc_priv *priv = rx_offload_to_priv(offload); >> u32 data, mbx_mask; >> + int lost; >> >> mbx_mask = BIT(mbxno); >> data = hecc_read_mbx(priv, mbxno, HECC_CANMID); >> @@ -552,9 +562,12 @@ static unsigned int ti_hecc_mailbox_read(struct can_rx_offload *offload, >> } >> >> *timestamp = hecc_read_stamp(priv, mbxno); >> + lost = hecc_read(priv, HECC_CANRML) & mbx_mask; >> + if (unlikely(lost)) >> + priv->offload.dev->stats.rx_fifo_errors++; > In the flexcan and at91_can driver we're incrementing the following errors: > dev->stats.rx_over_errors++; > dev->stats.rx_errors++; I understood it as follows, see[1] e.g.: rx_errors -> link level errors, not really applicable to CAN (perhaps in single shot mode or if you want (and can) report retransmissions) rx_over_errors -> the hardware itself cannot keep up. Not applicable for CAN. rx_fifo_errors -> the software driver cannot keep up. So I picked that one. rx_dropped -> software is dropping on purpose based on limits etc. But I might be wrong. > You can save the register access if you only check for overflows if > reading from the lowest prio mailbox. > > If you're discarding the data if the mailbox is marked as overflow > there's no need to read the data in the first place. > >> hecc_write(priv, HECC_CANRMP, mbx_mask); >> >> - return 1; >> + return !lost; >> } >> >> static int ti_hecc_error(struct net_device *ndev, int int_status, >> Mind it that you don't cause a race! The bit can become set during reading of the data, it should be check _after_ we have a copy of the mailbox. You can do a double check, one before one after, but since there should be no fifo overflow anyway, there is no reason to optimize for that path. (@250k I cannot get more then 3 messages in the fifo...). Regards, Jeroen [1] https://community.mellanox.com/s/article/counters-troubleshooting-for-linux-driver