On 10.12.24 12:49, Dragos Tatulea wrote:
>
>
> On 06.12.24 16:25, Alexandra Winter wrote:
>>
>>
>> On 04.12.24 15:36, Eric Dumazet wrote:
>>> I would suggest the opposite : copy the headers (typically less than
>>> 128 bytes) on a piece of coherent memory.
>>>
>>> As a bonus, if skb->len is smaller than 256 bytes, copy the whole skb.
>>>
>>> include/net/tso.h and net/core/tso.c users do this.
>>>
>>> Sure, patch is going to be more invasive, but all arches will win.
>>
>>
>> Thank you very much for the examples, I think I understand what you are proposing.
>> I am not sure whether I'm able to map it to the mlx5 driver, but I could
>> try to come up with a RFC. It may take some time though.
>>
>> NVidia people, any suggestions? Do you want to handle that yourselves?
>>
> Discussed with Saeed and he proposed another approach that is better for
> us: copy the whole skb payload inline into the WQE if its size is below a
> threshold. This threshold can be configured through the
> tx-copybreak mechanism.
>
> Thanks,
> Dragos

Thank you very much, Dragos and Saeed.
I am not sure I understand the details of "inline into the WQE".
The idea seems to be to use a premapped coherent array per WQ that is indexed
by queue element index and can be used to copy headers and maybe small
messages into.
I think I see something similar to your proposal in mlx4 (?).
To me the general concept seems to be similar to what Eric is proposing.
Did I get it right?

I really like the idea to use tx-copybreak for threshold configuration.

As Eric mentioned, that is not a very small patch and may not be fit for
backporting to older distro versions.
What do you think of a two-step approach as described in the other sub-thread?
A simple patch for mitigation that can be backported, and then the improvement
as a replacement?

Thanks,
Alexandra
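
For illustration only, the "premapped coherent array per WQ, indexed by queue
element index" idea from the message above can be sketched in plain userspace C.
This is not mlx5 or mlx4 code: struct tx_queue, SLOT_SIZE and the txq_*()
helpers are hypothetical names, calloc() merely stands in for a coherent DMA
allocation made once at queue setup, and the copybreak field plays the role of
the tx-copybreak threshold. Ring-full handling and completion processing are
left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SLOT_SIZE 256	/* hypothetical size of one per-element copy buffer */

/* Hypothetical stand-in for a send queue with a premapped copy array. */
struct tx_queue {
	uint8_t *bounce;	/* in a driver: one coherent allocation per queue */
	size_t num_entries;	/* ring size, must be a power of two */
	size_t copybreak;	/* copy packets up to this length, <= SLOT_SIZE */
	size_t producer;	/* running count of copied packets */
};

/* Allocate the copy array once, at queue setup. Returns 0 or -1. */
int txq_init(struct tx_queue *q, size_t num_entries, size_t copybreak)
{
	if (num_entries == 0 || (num_entries & (num_entries - 1)) != 0)
		return -1;
	if (copybreak > SLOT_SIZE)
		return -1;

	q->bounce = calloc(num_entries, SLOT_SIZE);
	if (!q->bounce)
		return -1;

	q->num_entries = num_entries;
	q->copybreak = copybreak;
	q->producer = 0;
	return 0;
}

void txq_destroy(struct tx_queue *q)
{
	free(q->bounce);
	q->bounce = NULL;
}

/*
 * Returns 1 and stores the slot index in *slot_out if the packet was copied
 * into the premapped array, so no per-packet mapping is needed.
 * Returns 0 if the packet is above the threshold and has to take the
 * regular mapping path; *slot_out is left untouched in that case.
 */
int txq_try_copy(struct tx_queue *q, const void *pkt, size_t len,
		 size_t *slot_out)
{
	size_t slot;

	assert(q->copybreak <= SLOT_SIZE);

	if (len > q->copybreak)
		return 0;

	/* The slot is selected by queue element index, as in the text. */
	slot = q->producer & (q->num_entries - 1);
	memcpy(q->bounce + slot * SLOT_SIZE, pkt, len);
	q->producer++;

	*slot_out = slot;
	return 1;
}
```

In this sketch the only per-packet cost below the threshold is a memcpy() into
memory that was mapped once; whether copying into the WQE itself or into a
separate array is the better fit for mlx5 is exactly the open question in the
thread.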