On Mon, Jan 22, 2024 at 08:00:59AM -0800, Souradeep Chakrabarti wrote:
> Existing MANA design assigns IRQ to every CPU, including sibling
> hyper-threads. This may cause multiple IRQs to be active simultaneously
> in the same core and may reduce the network performance.
>
> Improve the performance by assigning IRQ to non sibling CPUs in local
> NUMA node. The performance improvement we are getting using ntttcp with
> following patch is around 15 percent against existing design and
> approximately 11 percent, when trying to assign one IRQ in each core
> across NUMA nodes, if enough cores are present.
> The change will improve the performance for the system
> with high number of CPU, where number of CPUs in a node is more than
> 64 CPUs. Nodes with 64 CPUs or less than 64 CPUs will not be affected
> by this change.
>
> The performance study was done using ntttcp tool in Azure.
> The node had 2 nodes with 32 cores each, total 128 vCPU and number of channels
> were 32 for 32 RX rings.
>
> The below table shows a comparison between existing design and new
> design:
>
> IRQ   node-num    core-num    CPU          performance(%)
> 1     0 | 0       0 | 0       0 | 0-1      0
> 2     0 | 0       0 | 1       1 | 2-3      3
> 3     0 | 0       1 | 2       2 | 4-5      10
> 4     0 | 0       1 | 3       3 | 6-7      15
> 5     0 | 0       2 | 4       4 | 8-9      15
> ---
> ---
> 25    0 | 0       12| 24      24| 48-49    12
> ---
> 32    0 | 0       15| 31      31| 62-63    12
> 33    0 | 0       16| 0       32| 0-1      10
> ---
> 64    0 | 0       31| 31      63| 62-63    0

Do the omitted lines mean 5-24: 15%, 25-31: 12% and 33-63: 10%?
Or does that mean you didn't test those? It would be nice to have
full coverage...

Thanks,
Yury
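
For reference, the mapping implied by the quoted table can be sketched as below. This is only an illustration of how the "existing | new" columns appear to be derived, not code from the MANA driver. It assumes a single node with 32 cores and 2 hyper-threads per core, with sibling CPUs numbered 2c and 2c+1 as the table suggests. The names `IrqPlacement` and `place_irq` are made up for this sketch.

```python
from typing import NamedTuple, Tuple


class IrqPlacement(NamedTuple):
    irq: int                   # 1-based IRQ index, as in the table
    old_core: int              # core used by the existing design
    old_cpu: int               # CPU used by the existing design
    new_core: int              # core used by the new design
    new_cpus: Tuple[int, int]  # first and last sibling CPU of that core


def place_irq(irq: int, cores_per_node: int = 32,
              threads_per_core: int = 2) -> IrqPlacement:
    """Reproduce one table row under the assumptions stated above."""
    i = irq - 1
    # Existing design (as the table reads): IRQ n lands on CPU n-1,
    # so two consecutive IRQs share the two siblings of one core.
    old_cpu = i
    old_core = old_cpu // threads_per_core
    # New design (as the table reads): one IRQ per core first,
    # wrapping back to core 0 once every core in the node has one.
    new_core = i % cores_per_node
    first = new_core * threads_per_core
    last = first + threads_per_core - 1
    return IrqPlacement(irq, old_core, old_cpu, new_core, (first, last))


if __name__ == "__main__":
    for n in (1, 2, 25, 32, 33, 64):
        print(place_irq(n))
```

With these assumptions the rows for IRQ 1, 2, 25, 32, 33 and 64 match the table, which makes it easier to see which IRQ ranges the elided rows would cover.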