On Friday, 2 February 2018 10:25:09 CET, Gerlando Falauto wrote:
> I saw your patches about max310x and all the improvements you brought in to
> achieve burst transfers.
> My goal would be to push the chip to its limits (i.e. to 2.000.000bps) on a
> Raspberry Pi.
Hi Gerlando,
I'm happy that these patches help others as well :).
> I had already noticed the inefficiencies and tried something similar to
> your approach for burst transfers (but sticking to regmap).
I wonder how you did that? I looked at regmap, but it felt as if it wasn't
really possible to persuade regmap to just keep the SCLK line running for
batch-reading additional bytes from the same register (and only for
register 0x00).
> In both cases (with or without regmap) it looks like the CPU is the
> bottleneck.
> I was thinking of using DMA and your approach looks just one step away from
> achieving this.
Regarding DMA, I have no clue how it works on RPi. I know that the SoC I
have has something similar, but it's severely limited and the kernel
doesn't really support it.
> Did you ever consider this?
Here's what I think should help you get better performance:
1) Ensure that userspace actually configures the UART in a mode which
enables batched reads (also known as "SPI burst access"). This is a big
catch. By default, reading each byte requires two SPI transactions, one for
reading one byte from the RX FIFO, and another one for the Line Status
Register (LSR). The HW does not support "batch reading" the RX buffer along
with the LSR, unfortunately. This means that by default, each byte received
by the UART requires at least four bytes to be transferred over SPI (two
transactions of one command byte plus one data byte each).
However, if userspace tells the kernel that it is not interested in
checking the BREAK condition, in determining RX parity errors, etc., then
the kernel can skip (and does skip, at least in the tty-next tree) reading
the LSR. And because it's only reading from the RX FIFO, it can leverage
that SPI burst access.
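To put rough numbers on the difference, here is a minimal C sketch of the SPI byte counts. It assumes that every SPI transaction starts with one command byte followed by its data bytes, which is what the four-bytes-per-character figure implies. The function names are made up for illustration and are not part of the driver.

```c
#include <stddef.h>

/* Assumption: each SPI transaction costs one command byte plus its data. */
enum { SPI_CMD_BYTES = 1 };

/* Default mode: per received character, one RX FIFO read and one LSR read. */
size_t spi_bytes_with_lsr(size_t uart_bytes)
{
    return uart_bytes * 2 * (SPI_CMD_BYTES + 1);
}

/* LSR skipped: the whole batch is read from the RX FIFO in a single burst. */
size_t spi_bytes_single_burst(size_t uart_bytes)
{
    return uart_bytes ? SPI_CMD_BYTES + uart_bytes : 0;
}
```

Under these assumptions, draining a full 128-byte FIFO costs 512 SPI bytes in the default mode and 129 SPI bytes as a single burst.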
Here's a snippet of my code which does that `termios` configuration:
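    /* ignore parity errors; no break or parity reporting requested */
    m_config.c_iflag = IGNPAR;
    m_config.c_oflag = 0;
    /* 8 data bits, receiver enabled, modem control lines ignored */
    m_config.c_cflag = CLOCAL | CREAD | CS8 | B115200;
    m_config.c_lflag = 0;
    /* read() returns whatever is available, or nothing after 0.1 s */
    m_config.c_cc[VMIN] = 0;
    m_config.c_cc[VTIME] = 1;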
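For completeness, here is a self-contained sketch of how such a configuration could be applied from C. The helper names `fill_burst_friendly_termios` and `open_uart` are hypothetical. Unlike the snippet above, the sketch sets the speed through `cfsetispeed()`/`cfsetospeed()` instead of OR-ing `B115200` into `c_cflag`, which is the more portable spelling. It also assumes that the driver decides whether it needs the LSR based on `INPCK`, `BRKINT` and `PARMRK`, so those flags are left cleared.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <termios.h>
#include <unistd.h>

/* Fill *cfg with a raw 8N1 setup that requests no break/parity reporting. */
int fill_burst_friendly_termios(struct termios *cfg, speed_t speed)
{
    assert(cfg != NULL);
    memset(cfg, 0, sizeof(*cfg));
    cfg->c_iflag = IGNPAR;                /* INPCK, BRKINT, PARMRK stay clear */
    cfg->c_oflag = 0;                     /* no output processing */
    cfg->c_cflag = CLOCAL | CREAD | CS8;  /* 8 data bits, ignore modem lines */
    cfg->c_lflag = 0;                     /* non-canonical, no echo or signals */
    cfg->c_cc[VMIN] = 0;                  /* read() may return zero bytes */
    cfg->c_cc[VTIME] = 1;                 /* ... after a 0.1 s timeout */
    if (cfsetispeed(cfg, speed) < 0 || cfsetospeed(cfg, speed) < 0)
        return -1;
    return 0;
}

/* Open a tty and apply the settings; returns the fd, or -1 on error. */
int open_uart(const char *path, speed_t speed)
{
    struct termios cfg;
    int fd = open(path, O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -1;
    if (fill_burst_friendly_termios(&cfg, speed) < 0 ||
        tcsetattr(fd, TCSANOW, &cfg) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would use it as `int fd = open_uart(path, B115200);`, where `path` is whatever device node the UART shows up as on the system in question.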
2) Check the SPI frequency. The RPi's documentation suggests that there are
some limitations on the maximum SPI frequency. The available steps are
apparently a bit coarse, with a big gap between 16MHz and 32MHz. The chip
that I use is spec'ed to allow up to 26MHz, which means you would have to
run it at 16MHz on your system.
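As a rough illustration of that rounding, the sketch below picks the fastest usable SPI clock. Both the 250MHz core clock and the power-of-two dividers are assumptions chosen to reproduce the gap described above (31.25MHz vs. 15.625MHz); the real clock tree may differ, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the highest SPI clock not exceeding max_hz, assuming the controller
 * can only divide core_hz by powers of two (2, 4, 8, ...). Returns 0 if no
 * divider up to 65536 is slow enough.
 */
uint32_t pick_spi_hz(uint32_t core_hz, uint32_t max_hz)
{
    assert(max_hz > 0);
    for (uint32_t div = 2; div <= 65536; div *= 2) {
        uint32_t hz = core_hz / div;
        if (hz <= max_hz)
            return hz;
    }
    return 0;
}
```

With these assumed numbers, a chip limited to 26MHz ends up at 15.625MHz, i.e. it loses roughly 40% of the SPI bandwidth it could otherwise use.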
3) We can also improve the RX FIFO utilization. My device has a 128B
buffer, and when I added some debugging code to produce a histogram of the
actual watermark level when reading the RX FIFO, I was surprised to see
plenty of small transfers. That's because the current driver prefers to
read from the RX buffer "ASAP", as soon as there's something in there.
The HW also allows another mode of operation where it only raises an IRQ
once either:
- more than X bytes are in the FIFO,
- or any byte in the FIFO has been there for more than Y "periods", where a
"period" is the time it takes the UART to transmit/receive one byte.
Doing that would make a lot of sense in this context. If we always read
after, say, 32 bytes are in the buffer (or upon a matching timeout), then
it's likely that we will do 32-byte SPI transactions much more often. The
per-transaction command byte is then spread over the whole burst instead of
being paid for every received byte, which should reduce the SPI utilization
when reading by (asymptotically) 50%.
I plan to (eventually) send a patch doing just that, but ENOTIME for now.
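A minimal sketch of that trigger condition and of the expected saving, assuming again one command byte per SPI transaction and 10 bit times per character (8N1). The names are invented for illustration and do not correspond to driver code or chip registers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IRQ condition: FIFO level above the trigger, or the oldest byte timed out. */
bool rx_irq_due(unsigned level, unsigned trigger,
                unsigned oldest_age_periods, unsigned timeout_periods)
{
    return level > trigger ||
           (level > 0 && oldest_age_periods > timeout_periods);
}

/* Duration of one "period" (one character, assumed 8N1) in nanoseconds. */
uint64_t period_ns(uint32_t baud)
{
    assert(baud > 0);
    return 10ull * 1000000000ull / baud;
}

/*
 * Fraction of SPI bytes saved by reading `burst` bytes per transaction
 * instead of one byte per transaction: 1 - (burst + 1) / (2 * burst).
 */
double rx_spi_saving(unsigned burst)
{
    assert(burst > 0);
    return (burst - 1) / (2.0 * burst);
}
```

With 32-byte bursts the saving is 31/64, about 48%, and it approaches 50% as the burst grows. At 2,000,000 bps one period is 5 microseconds under the 8N1 assumption, so even a timeout of a few periods adds very little latency.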
Hope this helps.
With kind regards,
Jan