Re: [PATCH 1/7] can: rx-offload: continue on error

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Marc,

On 10/9/19 3:18 PM, Marc Kleine-Budde wrote:
> Hello Jeroen,
>
> I'm currently looking at the rx-offload error handling in detail.
>
> TLDR: I've added the patch to linux-can.
>
> Thanks,
> Marc
>
> For the record, the details:
>
> On 9/24/19 8:45 PM, Jeroen Hofstee wrote:
>> While can_rx_offload_offload_one will call mailbox_read to discard
>> the mailbox in case of an error,
> Yes.
>
> can_rx_offload_offload_one() will read into the discard mailbox in case
> of an error.
>
> Currently there are two kinds of errors:
> 1) the rx-offoad skb queue (between the IRQ handler and the NAPI)
>     is full
> 2) alloc_can_skb() fails to allocate a skb, due to OOM
>
>> can_rx_offload_irq_offload_timestamp bails out in the error case.
> Yes, in case of an error both can_rx_offload_irq_offload_timestamp() and
> can_rx_offload_irq_offload_fifo() will stop reading mailboxes, add the
> already filled skbs to the queue and schedule NAPI if needed.
>
> Currently both can_rx_offload_irq_offload_timestamp() and
> can_rx_offload_irq_offload_fifo() will return the number of queued
> mailboxes.
>
> This means in case of queue overflow or OOM, only one mailbox is
> discaded, before can_rx_offload_irq_offload_*() returning the number of
> successfully queued mailboxes so far.
>
> Looking at the flexcan driver:
>
> https://elixir.bootlin.com/linux/latest/source/drivers/net/can/flexcan.c#L867
>
>> 		while ((reg_iflag = flexcan_read_reg_iflag_rx(priv))) {
>> 			handled = IRQ_HANDLED;
>> 			ret = can_rx_offload_irq_offload_timestamp(&priv->offload,
>> 								   reg_iflag);
>> 			if (!ret)
>> 				break;
>> 		}
> [...]
>> 		if (reg_iflag1 & FLEXCAN_IFLAG_RX_FIFO_AVAILABLE) {
>> 			handled = IRQ_HANDLED;
>> 			can_rx_offload_irq_offload_fifo(&priv->offload);
>> 		}
> This means for the timestamp mode, at least one mailbox is discarded or
> if the error occurred after reading one or more mailboxes the while loop
> will try again. If the error persists a second mailbox is discarded.
>
> For the fifo mode, only one mailbox is discarded.
>
> Then the flexcan's IRQ is exited. If there are messages in mailboxes are
> pending another IRQ is triggered.... I doubt that this is a good idea.
>
> On the other hand the ti_hecc driver:
>
>> 		/* offload RX mailboxes and let NAPI deliver them */
>> 		while ((rx_pending = hecc_read(priv, HECC_CANRMP))) {
>> 			can_rx_offload_irq_offload_timestamp(&priv->offload,
>> 							     rx_pending);
>> 			hecc_write(priv, HECC_CANRMP, rx_pending);
>> 		}
> The error is ignored and the _all_ mailboxes are discarded (given the
> error persists).
>
>> Since it is typically called from a while loop, all message will
>> eventually be discarded. So lets continue on error instead to discard
>> them directly.
> After reading my own code and writing it up, your patch totally makes sense.
>
> If there is a shortage of resources, queue overrun or OOM, it makes no
> sense to return from the IRQ handler, if a mailbox is still active as it
> will trigger the IRQ again. Entering the IRQ handler again probably
> doesn't give the system time to recover from the resource problem.


Indeed, I have nothing to comment on that, besides thanks for
being willing to reconsider your own code.

With kind regards,

Jeroen






[Index of Archives]     [Automotive Discussions]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Security]     [Bugtraq]     [Linux]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]     [CAN Bus]

  Powered by Linux