This is a follow-up to the initial problems encountered trying to get
the AMD Epyc 7401 server to do host-to-host communication through NTB
(please see the thread for background info).
The IO_PAGE_FAULT flags=0x0070 errors seen on write ops were in fact
related to proxy ID setup, as Logan had suggested. The AMD iommu code
only processed the 'last' proxy ID/dma alias; that last proxy ID was the
one associated with reads, so read ops succeeded while write ops
faulted. After adding support to process all of the proxy IDs in the
AMD iommu code (plus adding dma_map_resource support), the AMD Epyc
server can now be configured in a 4-host NTB setup and communicate over
NTB (TCP/IP over ntb_netdev) with the other 3 hosts.
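For anyone following along, the change is conceptually along these
lines: walk every DMA alias with pci_for_each_dma_alias() and program a
device table entry for each one, instead of keeping only the last alias
the walk reports. This is a simplified sketch, not the actual patch;
__attach_alias()/attach_all_aliases() are names I made up, and
set_dte_entry() is AMD iommu driver-internal (its signature varies by
kernel version):

#include <linux/pci.h>
/* protection_domain and set_dte_entry() are AMD iommu driver
 * internals (drivers/iommu/amd_iommu*); shown here only to
 * illustrate the per-alias setup. */

static int __attach_alias(struct pci_dev *pdev, u16 alias, void *data)
{
	struct protection_domain *domain = data;

	/* Program a device table entry for this proxy ID/alias too,
	 * instead of only remembering the last alias reported. */
	set_dte_entry(alias, domain, false);
	return 0;	/* keep walking the remaining aliases */
}

static void attach_all_aliases(struct pci_dev *pdev,
			       struct protection_domain *domain)
{
	/* pci_for_each_dma_alias() visits the requester ID plus every
	 * DMA alias (including the NTB proxy IDs) and invokes the
	 * callback for each one. */
	pci_for_each_dma_alias(pdev, __attach_alias, domain);
}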
The problem that we are now experiencing with the AMD Epyc 7401 server,
and for which I could use some help, is very poor iperf performance
over NTB/ntb_netdev.
The iperf numbers over NTB start off at around 800 Mbits/s and quickly
degrade to the 20 Mbits/s range. Running 'top' during iperf, I see many
instances (25+) of ksoftirqd running, which suggests that interrupt
load is overwhelming the system's interrupt processing.
/proc/interrupts shows lots of 'ccp-5' dma interrupt activity as well
as ntb_netdev interrupt activity. After eliminating the netdev
interrupts by configuring ntb_netdev to 'use_poll' and leaving only
ccp, the poor iperf performance persists.
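For context on the interrupt load: ntb_transport uses the generic
dmaengine client pattern, roughly like the sketch below, and requests a
completion notification per copy descriptor via DMA_PREP_INTERRUPT. If
the ccp driver raises an interrupt for every completed descriptor, the
interrupt rate scales with packet rate, which would match the
ksoftirqd storm. This is a hedged sketch of the generic client pattern,
not ntb_transport's exact code; submit_copy() is a name I made up:

#include <linux/dmaengine.h>

static int submit_copy(struct dma_chan *chan, dma_addr_t dst,
		       dma_addr_t src, size_t len,
		       dma_async_tx_callback done, void *ctx)
{
	struct dma_async_tx_descriptor *txd;
	dma_cookie_t cookie;

	/* One descriptor per copy; DMA_PREP_INTERRUPT asks the engine
	 * to signal completion for this descriptor. */
	txd = dmaengine_prep_dma_memcpy(chan, dst, src, len,
					DMA_PREP_INTERRUPT);
	if (!txd)
		return -ENOMEM;

	txd->callback = done;
	txd->callback_param = ctx;

	cookie = dmaengine_submit(txd);
	if (dma_submit_error(cookie))
		return -EIO;

	/* Kick the engine to start processing queued descriptors. */
	dma_async_issue_pending(chan);
	return 0;
}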
As a comparison, if I replace the ccp dma with the plx dma (found on
the host adapter card) on the AMD server, I get a steady 9.4 Gbits/s
with iperf over NTB.
I've optimized for NUMA locality via numactl in all test runs.
So it appears that the iperf NTB performance issues on the AMD Epyc
server are related to the ccp dma and its interrupt processing.
Does anyone have experience with the ccp dma who might be able to help?
Any help or suggestions on how to proceed would be very much appreciated.
Thanks
Kit
kchow@xxxxxxxxxx