While debugging low throughput on a tcan4x5x chip using a logic
analyzer, I found that the SPI bus is silent for the *vast* majority of
time spent sending or receiving a CAN frame. Each SPI transfer takes ~5
microseconds, but there's an order of magnitude more time (50-60
microseconds) between them. This doesn't seem to be caused by any sort
of contention - it happens on a SPI bus with a single chip select and
no other drivers accessing it. Presumably there's a (relatively) large
fixed cost to request a transfer from the SPI controller on the
hardware I'm using (an NVIDIA Jetson platform).

Let's improve throughput by combining FIFO reads and writes into larger
transfers - one for ID & DLC, one for the frame data - instead of
handling single words at a time. We could reduce the number of
transfers further by batching certain control register reads, but this
is an easy place to start, since FIFO registers are contiguous.

Since TX and RX time is dominated by the fixed, per-transfer delays
mentioned above, this nets substantial performance improvements - about
20% faster for small CAN frames and nearly 5x faster for max size
(64 byte) CAN FD frames.

Matt Kline (2):
  can: m_can: Batch FIFO reads during CAN receive
  can: m_can: Batch FIFO writes during CAN send

 drivers/net/can/m_can/m_can.c          | 77 +++++++++++++-------------
 drivers/net/can/m_can/m_can.h          |  4 +-
 drivers/net/can/m_can/m_can_pci.c      | 11 ++--
 drivers/net/can/m_can/m_can_platform.c | 11 ++--
 drivers/net/can/m_can/tcan4x5x-core.c  | 12 ++--
 5 files changed, 60 insertions(+), 55 deletions(-)

--
2.31.1
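
For a rough feel of why batching helps, here is a back-of-the-envelope
sketch (not code from the series) that counts FIFO transfers per frame
before and after. The helper names are made up for illustration, and
GAP_US is an assumed midpoint of the 50-60 microsecond gap measured
above. It only models FIFO accesses and ignores control register
accesses and time on the wire, so it overstates the end-to-end gain.

```c
/* Assumed idle time between SPI transfers (50-60 us was observed). */
#define GAP_US 55

/* Number of 32-bit FIFO words needed to hold len payload bytes. */
static unsigned int data_words(unsigned int len)
{
	return (len + 3) / 4;
}

/* Word at a time: one transfer for ID, one for DLC, one per data word. */
static unsigned int transfers_per_word(unsigned int len)
{
	return 2 + data_words(len);
}

/* Batched: one transfer for ID & DLC, one for the frame data (if any). */
static unsigned int transfers_batched(unsigned int len)
{
	return 1 + (len ? 1 : 0);
}

/* Idle time spent between FIFO transfers; ignores time on the wire. */
static unsigned int fifo_gap_us(unsigned int transfers)
{
	return transfers * GAP_US;
}
```

Under these assumptions a 64 byte CAN FD frame goes from 18 FIFO
transfers to 2, while an 8 byte frame goes from 4 to 2, which is why
the gain grows with frame size.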