On Tue. 14 janv. 2021 at 01:04, Marc Kleine-Budde <mkl@xxxxxxxxxxxxxx> wrote: > > On 1/13/21 1:15 PM, Vincent MAILHOL wrote: > >>> +/** > >>> + * es58x_calculate_crc() - Compute the crc16 of a given URB. > >>> + * @urb_cmd: The URB command for which we want to calculate the CRC. > >>> + * @urb_len: Length of @urb_cmd. Must be at least bigger than 4 > >>> + * (ES58X_CRC_CALC_OFFSET + sizeof(crc)) > >>> + * > >>> + * Return: crc16 value. > >>> + */ > >>> +static u16 es58x_calculate_crc(const union es58x_urb_cmd *urb_cmd, u16 urb_len) > >>> +{ > >>> + u16 crc; > >>> + ssize_t len = urb_len - ES58X_CRC_CALC_OFFSET - sizeof(crc); > >>> + > >>> + WARN_ON(len < 0); > >> > >> Is it possible to ensure earlier, that the urbs are of correct length? > > > > Easy answer: it is ensured. > > Okay, then get rid of those checks :) > > > On the Tx branch, I create the urbs so I > > know for sure that the length is correct. On the Rx branch, I have a > > dedicated function: es58x_check_rx_urb() for this purpose. I > > will remove that WARN_ON() and the one in es58x_get_crc(). > > > > I will also check the other WARN_ON() in my code to see if they > > can be removed (none on my test throughout the last ten months or > > so could trigger any of these WARN_ON() so should be fine to > > remove but I will double check). Checked, all WARN_ON() will be removed in v11. > >>> +struct es58x_priv { > >>> + struct can_priv can; > >>> + struct es58x_device *es58x_dev; > >>> + struct urb *tx_urb; > >>> + > >>> + spinlock_t echo_skb_spinlock; /* Comments: c.f. supra */ > >>> + u32 current_packet_idx; > >>> + u16 echo_skb_tail_idx; > >>> + u16 echo_skb_head_idx; > >>> + u16 num_echo_skb; > >> > >> Can you explain me how the tx-path works, especially why you need the > >> current_packet_idx. > >> > >> In the mcp251xfd driver, the number of TX buffers is a power of two, that makes > >> things easier. tx_heads % len points to the next buffer to be filled, tx_tail % > >> len points to the next buffer to be completed. tx_head - tx_tail is the fill > >> level of the FIFO. This works without spinlocks. > > > > For what I understand of your explanations here are the equivalences > > between the etas_es58x and the mcp251xfd drivers: > > > > +--------------------+-------------------+ > > | etas_es58x | mcp251xfd | > > +--------------------+-------------------+ > > | current_packet_idx | tx_head | > > | echo_skb_tail_idx | tx_tail % len | > > | echo_skb_head_idx | tx_head % len | > > | num_echo_skb | tx_head - tx_tail | > > +--------------------+-------------------+ > > > > Especially, the current_packet_idx is sent to the device and returned > > to the driver upon completion. > > Is current_packet_idx used only for the TX-PATH? It is used in the RX path of loopback packet. When a packet comes back, its index should be equal to current_packet_idx - num_echo_skb. I use this in es58x_can_get_echo_skb() to check that there are no packet drops. Of course, if the FIFO size is a power of two, the current_packet_idx would become useless. > > I wish the TX buffers were a power of two which is unfortunately not > > the case. The theoretical TX buffer sizes are 330 and 500 for the two > > devices so I wrote the code to work with those values. The exact size > > of the TX buffer is actually even more of a mystery because during > > testing both devices were unstable when using the theoretical values > > and I had to lower these. There is a comment at the bottom of > > es581_4.c and es58x_fd.c to reflect those issues. > > What are the performance penalties for using 256 for the fd and 64 ofr the other? I checked my passed log, actually, I had good results with 256 on the FD. I lowered it to 255 with no strong reasons. For the classical CAN changing from 75 to 64 should still be enough to reach full busload. > > Because I do not > > have access to the source code of the firmware, I could not identify > > the root cause. > > ok > > > My understanding is that having a queue size being a power of two is > > required in order not to use spinlocks (else, modulo operations would > > break when the index wraparound back to zero). I tried to minimize the > > number of spinlock: only one per bulk send or bulk receive. > > With queue size being power of two the modulo can be written as a mask > operation, so it's reasonable fast. So you only need to care about tx_head and > tx_tail, and there is only one writer for each variable. With a little dance and > barriers when stopping and starting the queue it's race-free without spinlocks. Yep, I checked linux/kfifo.h in the past. I think I understand the theory, now I need to practice. I will try to do the change. Will get back to you later when it is done and tested. Yours sincerely, Vincent