I pulled in the c_can patches from the for-kurt branch (63574e9 thru
bf01f717) and tested them on my device. The number of overruns is
noticeably lower; however, overall system performance seems to have
degraded. For example, console response and the Bluetooth data rate
are noticeably slower.

I also noticed that while the number of overruns decreased, the number
of errors increased:

BEFORE:
# ip -details -statistics link show can0
3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 1000
    link/can  promiscuity 435268
    can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
          bitrate 250000 sample-point 0.875
          tq 250 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
          c_can: tseg1 2..16 tseg2 1..8 sjw 1..4 brp 1..1024 brp-inc 1
          clock 24000000
          re-started bus-errors arbit-lost error-warn error-pass bus-off
          0          0          0          3          3          0         numtxqueues 435300 gso_max_size 435364 gso_max_segs 435400
    RX: bytes  packets  errors  dropped overrun mcast
    3086416    385802   1246    0       1246    0
    TX: bytes  packets  errors  dropped carrier collsns
    20         5        0       0       0       0

AFTER (use rx_offload in IRQ handler):
# ip -details -statistics link show can0
3: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 1000
    link/can  promiscuity 435268
    can state ERROR-ACTIVE (berr-counter tx 0 rx 0) restart-ms 0
          bitrate 250000 sample-point 0.875
          tq 250 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
          c_can: tseg1 2..16 tseg2 1..8 sjw 1..4 brp 1..1024 brp-inc 1
          clock 24000000
          re-started bus-errors arbit-lost error-warn error-pass bus-off
          0          0          0          3          3          0         numtxqueues 435300 gso_max_size 435364 gso_max_segs 435400
    RX: bytes  packets  errors  dropped overrun mcast
    6366640    795839   27906   0       39      0
    TX: bytes  packets  errors  dropped carrier collsns
    20         5        0       0       0       0

--elenita

On Thu, Oct 17, 2019 at 11:13 AM Elenita Hinds <ecathinds@xxxxxxxxx> wrote:
>
> Thanks, Marc and Kurt.
>
> >> Can you give me a pointer to this driver? Are you talking about the
> >> mainline linux-3.1?
>
> I sit corrected on this one. Further investigation indicates that this
> driver (authored by TI for AM335x) was never submitted to mainline
> Linux 3.1.
> Here is the link to the driver:
> https://github.com/calixtosystems/linux-am335x/tree/master/linux-am33x/drivers/net/can/d_can.
>
> >> The algorithm that tries to read the CAN frames in the correct order was
> >> added in v3.15. That algorithm should be ported to the rx-offload
> >> helper. This way the mailboxes can be read in interrupt context and not
> >> from NAPI (which runs in softirq context only).
> >>
> >> If this basically works, it can be extended to support 64 mailboxes.
> > I'm facing overflows, and have good results with the patchset I sent
> > last week. As Marc says, reading the mailboxes in NAPI softirq may cause
> > regular overflows.
>
> Ok. I see the newer changes to the c_can driver and will be trying those.
>
> --elenita
>
>
> On Thu, Oct 17, 2019 at 2:47 AM Kurt Van Dijck
> <dev.kurt@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Wed, 16 Oct 2019 23:01:41 +0200, Marc Kleine-Budde wrote:
> > > On 10/16/19 9:06 PM, Elenita Hinds wrote:
> > > > I'm hoping someone can help me with this ...
> > > >
> > > > The DCAN module I'm testing with supports 64 CAN messages and DMA. The
> > > > combined c_can/d_can driver seems to only support 16 RX message objects
> > > > and no DMA (as far as I can tell).
> > >
> > > ACK
> > >
> > > > I noticed that an older Linux version
> > > > (3.1) implemented a separate d_can driver that supports both. I'm
> > > > wondering why these were removed from the latest c_can/d_can files.
> > >
> > > Can you give me a pointer to this driver? Are you talking about the
> > > mainline linux-3.1?
> > >
> > > > The reason for this question is I'm seeing frame losses and I think it
> > > > is due to the driver. Increasing the socket buffer sizes did not have
> > > > any effect; the number of overruns was still pretty large.
> > > >
> > > > Any feedback would be appreciated.
> > >
> > > The problem with the c_can and d_can is, that it doesn't have a proper
> > > FIFO but only mailboxes. And these mailboxes don't implement a
> > > timestamp, so that it's not that easy to read messages in the correct order.
> > >
> > > Does the device support bus master DMA? As CAN messages are quite small,
> > > the overhead of setting up DMA might be too high.
> > >
> > > The algorithm that tries to read the CAN frames in the correct order was
> > > added in v3.15. That algorithm should be ported to the rx-offload
> > > helper. This way the mailboxes can be read in interrupt context and not
> > > from NAPI (which runs in softirq context only).
> > >
> > > If this basically works, it can be extended to support 64 mailboxes.
> >
> > I'm facing overflows, and have good results with the patchset I sent
> > last week. As Marc says, reading the mailboxes in NAPI softirq may cause
> > regular overflows.
> >
> > Kurt
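
As a rough cross-check of the RX counters in the BEFORE and AFTER
output at the top of this message, here is a minimal Python sketch that
turns them into per-packet rates. The RxStats class and its method names
are made up for illustration only, and it assumes the RX "errors" column
already includes the overruns (which is consistent with the BEFORE
numbers, where both are 1246).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RxStats:
    """RX counters copied by hand from `ip -details -statistics link show can0`."""
    packets: int
    errors: int
    overrun: int

    def error_rate(self) -> float:
        # Fraction of received packets counted as errors.
        return self.errors / self.packets

    def overrun_rate(self) -> float:
        # Fraction of received packets counted as overruns.
        return self.overrun / self.packets

    def non_overrun_errors(self) -> int:
        # Assumption: "errors" includes "overrun", so the difference is
        # whatever else the driver counted as an RX error.
        return self.errors - self.overrun


# Values taken from the BEFORE and AFTER outputs in this message.
before = RxStats(packets=385802, errors=1246, overrun=1246)
after = RxStats(packets=795839, errors=27906, overrun=39)

for label, stats in (("before", before), ("after", after)):
    print(
        f"{label}: error rate {stats.error_rate():.4%}, "
        f"overrun rate {stats.overrun_rate():.4%}, "
        f"non-overrun errors {stats.non_overrun_errors()}"
    )
```

With these counters, the non-overrun error count goes from 0 to 27867,
so nearly all of the AFTER errors are something other than overruns,
and the overall error rate rises from roughly 0.3% to roughly 3.5% of
received packets.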