> -----Original Message-----
> From: Caleb Sander Mateos <csander@xxxxxxxxxxxxxxx>
> Sent: Wednesday, October 30, 2024 5:23 PM
>
> net_dim() is currently passed a struct dim_sample argument by value.
> struct dim_sample is 24 bytes. Since this is greater than 16 bytes, x86-64
> passes it on the stack. All callers have already initialized dim_sample on
> the stack, so passing it by value requires pushing a duplicated copy to the
> stack. Either writing to the stack and immediately reading it, or perhaps
> dereferencing addresses relative to the stack pointer in a chain of push
> instructions, seems to perform quite poorly.
>
> In a heavy TCP workload, mlx5e_handle_rx_dim() consumes 3% of CPU time,
> 94% of which is attributed to the first push instruction to copy dim_sample
> on the stack for the call to net_dim():
>
>         // Call ktime_get()
>    0.26 |4ead2: call 4ead7 <mlx5e_handle_rx_dim+0x47>
>         // Pass the address of struct dim in %rdi
>         |4ead7: lea 0x3d0(%rbx),%rdi
>         // Set dim_sample.pkt_ctr
>         |4eade: mov %r13d,0x8(%rsp)
>         // Set dim_sample.byte_ctr
>         |4eae3: mov %r12d,0xc(%rsp)
>         // Set dim_sample.event_ctr
>    0.15 |4eae8: mov %bp,0x10(%rsp)
>         // Duplicate dim_sample on the stack
>   94.16 |4eaed: push 0x10(%rsp)
>    2.79 |4eaf1: push 0x10(%rsp)
>    0.07 |4eaf5: push %rax
>         // Call net_dim()
>    0.21 |4eaf6: call 4eafb <mlx5e_handle_rx_dim+0x6b>
>
> To allow the caller to reuse the struct dim_sample already on the stack,
> pass the struct dim_sample by reference to net_dim().
>
> Signed-off-by: Caleb Sander Mateos <csander@xxxxxxxxxxxxxxx>
> ---

Thank you for this patch.

For the ENA part:
Reviewed-by: Arthur Kiyanovski <akiyano@xxxxxxxxxx>

Thanks,
Arthur
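
For readers following along, the by-value versus by-reference point in the quoted description can be sketched in userspace C. This is only an illustration, not the kernel code: struct sample, total_by_value() and total_by_ref() are hypothetical names, and the field layout is an assumption chosen to give a 24-byte struct with the field offsets seen in the listing above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 24-byte sample struct (not the kernel type). */
struct sample {
	int64_t  time;       /* offset 0x00 */
	uint32_t pkt_ctr;    /* offset 0x08 */
	uint32_t byte_ctr;   /* offset 0x0c */
	uint16_t event_ctr;  /* offset 0x10, followed by 2 bytes of padding */
	uint32_t comp_ctr;   /* offset 0x14 */
};

/* Larger than 16 bytes, so the x86-64 System V ABI passes it in memory. */
static_assert(sizeof(struct sample) == 24, "expected a 24-byte struct");

/* By value: the caller has to build a second copy of the struct on the
 * stack as the argument, even if it already holds one in a local. */
uint64_t total_by_value(struct sample s)
{
	return (uint64_t)s.pkt_ctr + s.byte_ctr + s.event_ctr;
}

/* By reference: the caller passes only a pointer to the struct it already
 * has; no duplicate copy is needed. */
uint64_t total_by_ref(const struct sample *s)
{
	return (uint64_t)s->pkt_ctr + s->byte_ctr + s->event_ctr;
}
```

Both functions compute the same result; the difference is only in what the caller must do before the call. Comparing the output of a compiler run with -O2 -S for a caller of each variant should show the extra stack copy in the by-value case, though the exact instructions depend on the compiler and version.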