On Tue, Aug 21, 2018 at 4:35 PM, Logan Gunthorpe <logang@xxxxxxxxxxxx> wrote:
>
> On 21/08/18 05:28 PM, Eric Pilmore wrote:
>>
>> On Tue, Aug 21, 2018 at 4:20 PM, Logan Gunthorpe <logang@xxxxxxxxxxxx> wrote:
>>
>> On 21/08/18 05:18 PM, Eric Pilmore wrote:
>> > We have been running locally with Kit's change for dma_map_resource and its
>> > incorporation in ntb_async_tx_submit for the destination address and it runs
>> > fine under "load" (iperf) on a Xeon (Xeon(R) CPU E5-2680 v4 @ 2.40GHz) based
>> > system, regardless of whether the DMA engine being used is IOAT or a PLX
>> > device sitting in the PCIe tree. However, when we go back to a i7
>> > (i7-7700K CPU @ 4.20GHz) based system it seems to run into issues,
>> > specifically when put under a load. In this case, just having a load using
>> > a single ping command with an interval=0, i.e. no delay between ping
>> > packets, after a few thousand packets the system just hangs. No panic or
>> > watchdogs. Note that in this scenario I can only use a PLX DMA engine.
>>
>> This is just my best guess: but it sounds to me like a bug in the PLX
>> DMA driver or hardware.
>>
>>
>> The PLX DMA driver? But the PLX driver isn't really even involved in the
>> mapping stage. Are you thinking maybe the stage at which the DMA descriptor
>> is freed and the PLX DMA driver does a dma_descriptor_unmap?
>
> Hmm, well what would make you think the hang is during
> mapping/unmapping?

Well, the only difference between success and failure is running with the
call to dma_map_resource for the destination address, which is a PCI BAR
address. Prior to Kit introducing this call, we never created a mapping for
the destination PCI BAR address and it worked fine on all systems when using
PLX DMA. It was only when we went to a Xeon system and attempted to use IOAT
DMA that we found we needed a mapping for that destination PCI BAR address.

The only thing the PLX driver does related to "mappings" is a call to
dma_descriptor_unmap when the descriptor is freed, however that is more of
an administrative step to clean up the unmap-data data structure used when
the mapping was originally established.

> I would expect a hang to be in handling completions
> from the DMA engine or something like that.
>
>> Again, PLX did not exhibit any issues on the Xeon system.
>
> Oh, I missed that. That puts a crinkle in my theory but, as you say, it
> could be a timing issue.
>
> Also, it's VERY strange that it would hang the entire system. That makes
> things very hard to debug...

Tell me about it! ;-)

Eric