This is a follow-up to the initial problems encountered trying to get
the AMD Epyc 7401 server to do host-to-host communication through NTB
(please see the thread for background info).
The IO_PAGE_FAULT flags=0x0070 errors seen on write ops were in fact
related to proxy ID setup, as Logan had suggested. The AMD iommu code
only processed the 'last' proxy ID/dma alias; that last proxy ID was the
one associated with reads, so read ops succeeded while write ops
faulted. After adding support to process all of the proxy IDs in the
AMD iommu code (plus adding dma_map_resource support), the AMD Epyc
server can now be configured in a 4-host NTB setup and communicate over
NTB (TCP/IP over ntb_netdev) with the other 3 hosts.
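For anyone following along, the change is conceptually along these
lines: walk every DMA alias with pci_for_each_dma_alias() and program a
device table entry for each one, instead of keeping only the last alias
the walk reports. This is a simplified sketch, not the actual patch;
__attach_alias()/attach_all_aliases() are names I made up, and
set_dte_entry() is AMD iommu driver-internal (its signature varies by
kernel version):

#include <linux/pci.h>
/* protection_domain and set_dte_entry() are AMD iommu driver
 * internals (drivers/iommu/amd_iommu*); shown here only to
 * illustrate the per-alias setup. */

static int __attach_alias(struct pci_dev *pdev, u16 alias, void *data)
{
	struct protection_domain *domain = data;

	/* Program a device table entry for this proxy ID/alias too,
	 * instead of only remembering the last alias reported. */
	set_dte_entry(alias, domain, false);
	return 0;	/* keep walking the remaining aliases */
}

static void attach_all_aliases(struct pci_dev *pdev,
			       struct protection_domain *domain)
{
	/* pci_for_each_dma_alias() visits the requester ID plus every
	 * DMA alias (including the NTB proxy IDs) and invokes the
	 * callback for each one. */
	pci_for_each_dma_alias(pdev, __attach_alias, domain);
}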
The problem that we are now experiencing with the AMD Epyc 7401 server,
and for which I could use some help, is very poor iperf performance
over NTB/ntb_netdev.
The iperf numbers over NTB start off at around 800 Mbits/s and quickly
degrade to the 20 Mbits/s range. Running 'top' during iperf, I see many
instances (25+) of ksoftirqd running, which suggests that interrupt
load is overwhelming the system's interrupt processing.
/proc/interrupts shows lots of 'ccp-5' dma interrupt activity as well
as ntb_netdev interrupt activity. After eliminating the netdev
interrupts by configuring ntb_netdev to 'use_poll' and leaving only
ccp, the poor iperf performance persists.
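For context on the interrupt load: ntb_transport uses the generic
dmaengine client pattern, roughly like the sketch below, and requests a
completion notification per copy descriptor via DMA_PREP_INTERRUPT. If
the ccp driver raises an interrupt for every completed descriptor, the
interrupt rate scales with packet rate, which would match the
ksoftirqd storm. This is a hedged sketch of the generic client pattern,
not ntb_transport's exact code; submit_copy() is a name I made up:

#include <linux/dmaengine.h>

static int submit_copy(struct dma_chan *chan, dma_addr_t dst,
		       dma_addr_t src, size_t len,
		       dma_async_tx_callback done, void *ctx)
{
	struct dma_async_tx_descriptor *txd;
	dma_cookie_t cookie;

	/* One descriptor per copy; DMA_PREP_INTERRUPT asks the engine
	 * to signal completion for this descriptor. */
	txd = dmaengine_prep_dma_memcpy(chan, dst, src, len,
					DMA_PREP_INTERRUPT);
	if (!txd)
		return -ENOMEM;

	txd->callback = done;
	txd->callback_param = ctx;

	cookie = dmaengine_submit(txd);
	if (dma_submit_error(cookie))
		return -EIO;

	/* Kick the engine to start processing queued descriptors. */
	dma_async_issue_pending(chan);
	return 0;
}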
As a comparison, if I replace the ccp dma with the plx dma (found on
the host adapter card) on the AMD server, I get a steady 9.4 Gbits/s
with iperf over NTB.
I've optimized for NUMA locality via numactl in all test runs.
So it appears that the iperf NTB performance issues on the AMD Epyc
server are related to the ccp dma and its interrupt processing.
Does anyone have experience with the ccp dma who might be able to help?
Any help or suggestions on how to proceed would be very much appreciated.
Thanks
Kit
kchow@xxxxxxxxxx