On 29.04.2022 23:31:28, Pavel Pisa wrote:
> > Split into separate patches and applied.
>
> Excuse me for the late reply, and thanks a lot for the split into the
> preferred form. Matej Vasilevski has tested the updated
> linux-can-next testing branch on a Xilinx Zynq 7000 based MZ_APO
> board and used it with his patches to proceed with the next round of
> testing of Jan Charvat's NuttX TWAI (CAN) driver on the ESP32C3. We
> plan to send the CTU CAN FD timestamping for RFC/discussion soon.

Sounds good!

> I would like to thank Andrew Dennison, who implemented, tested and
> shares an integration with LiteX and RISC-V:
>
>   https://github.com/litex-hub/linux-on-litex-vexriscv
>
> He uses a development version of the CTU CAN FD IP core with a
> configurable number of Tx buffers (2 to 8), which will require
> automatic setup logic in the driver.
>
> I need to discuss the actual state and his plans with Ondrej Ille.
> But basically, ntxbufs in ctucan_probe_common() has to be assigned
> from the TXTB_INFO TXT_BUFFER_COUNT field. On older core versions
> the TXT_BUFFER_COUNT field bits should read as zero, so when the
> value is zero, the original version with fixed 4 buffers can be
> recognized.

Makes sense.

> When the value is configurable, then for an (uncommon) number of
> buffers which is not a power of two, there will likely be a problem
> with the way the buffer queue is implemented:
>
>	txtb_id = priv->txb_head % priv->ntxbufs;
>	...
>	priv->txb_head++;
>	...
>	priv->txb_tail++;
>
> When I provided an example of this type of queue many years ago, I
> probably showed power-of-2 masking; modulo by an arbitrary number
> does not work across sequence counter overflow.
> Which means adding two "if"s there, unfortunately:
>
>	if (++priv->txb_tail == 2 * priv->ntxbufs)
>		priv->txb_tail = 0;

There's another way to implement this, here for ring->obj_num being a
power of 2:

| static inline u8 mcp251xfd_get_tx_head(const struct mcp251xfd_tx_ring *ring)
| {
| 	return ring->head & (ring->obj_num - 1);
| }
|
| static inline u8 mcp251xfd_get_tx_tail(const struct mcp251xfd_tx_ring *ring)
| {
| 	return ring->tail & (ring->obj_num - 1);
| }
|
| static inline u8 mcp251xfd_get_tx_free(const struct mcp251xfd_tx_ring *ring)
| {
| 	return ring->obj_num - (ring->head - ring->tail);
| }

If you want to allow a non-power-of-2 ring->obj_num, use
"% ring->obj_num" instead of "& (ring->obj_num - 1)".

I'm not sure if there is a real-world benefit (only a gut feeling, it
should be measured) of using more than 4, but fewer than 8, TX
buffers. You can make use of more TX buffers if you implement (fully
hardware based) TX IRQ coalescing (== handle more than one TX
complete interrupt at a time) like in the mcp251xfd driver, or BQL
support (== send more than one TX CAN frame at a time).

I've played a bit with BQL support in the mcp251xfd driver (which is
attached via SPI), but with mixed results. Probably an issue with
proper configuration.

> We need the 2 * priv->ntxbufs range to distinguish an empty queue
> from a full one... But modulo is not nice either, so I will probably
> come up with some other solution in the longer term. In the long
> term, I want to implement virtual queues to allow multiqueue to use
> the dynamic Tx priority of up to 8 buffers...

ACK, multiqueue TX support would be nice for things like the Earliest
TX Time First scheduler (ETF): 1 TX queue for ETF, the other for bulk
messages.

regards,
Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Vertretung West/Dortmund         | Phone: +49-231-2826-924     |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |