From: Jose Abreu <joabreu@xxxxxxxxxxxx>
Date: Jul/27/2019, 16:56:37 (UTC+00:00)

> From: Jon Hunter <jonathanh@xxxxxxxxxx>
> Date: Jul/26/2019, 15:11:00 (UTC+00:00)
>
> >
> > On 25/07/2019 16:12, Jose Abreu wrote:
> > > From: Jon Hunter <jonathanh@xxxxxxxxxx>
> > > Date: Jul/25/2019, 15:25:59 (UTC+00:00)
> > >
> > >>
> > >> On 25/07/2019 14:26, Jose Abreu wrote:
> > >>
> > >> ...
> > >>
> > >>> Well, I wasn't expecting that :/
> > >>>
> > >>> Per documentation of barriers I think we should set descriptor fields
> > >>> and then barrier and finally ownership to HW so that remaining fields
> > >>> are coherent before owner is set.
> > >>>
> > >>> Anyway, can you also add a dma_rmb() after the call to
> > >>> stmmac_rx_status() ?
> > >>
> > >> Yes. I removed the debug print added the barrier, but that did not help.
> > >
> > > So, I was finally able to setup NFS using your replicated setup and I
> > > can't see the issue :(
> > >
> > > The only difference I have from yours is that I'm using TCP in NFS
> > > whilst you (I believe from the logs), use UDP.
> >
> > So I tried TCP by setting the kernel boot params to 'nfsvers=3' and
> > 'proto=tcp' and this does appear to be more stable, but not 100% stable.
> > It still appears to fail in the same place about 50% of the time.
> >
> > > You do have flow control active right ? And your HW FIFO size is >= 4k ?
> >
> > How can I verify if flow control is active?
>
> You can check it by dumping register MTL_RxQ_Operation_Mode (0xd30).

Can you also add IOMMU debug in file "drivers/iommu/iommu.c" ?

And, please try attached debug patch.

---
Thanks,
Jose Miguel Abreu
Attachment:
0001-net-page_pool-Do-not-skip-CPU-sync.patch
Description: 0001-net-page_pool-Do-not-skip-CPU-sync.patch
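
For readers following the barrier discussion quoted above ("set descriptor fields and then barrier and finally ownership to HW"), here is a minimal user-space sketch of that ordering. It is not the stmmac code: the descriptor layout, the DEMO_OWN bit and all demo_* names are hypothetical, and C11 fences only stand in for the kernel's dma_wmb()/dma_rmb().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor, for illustration only (not the stmmac layout). */
#define DEMO_OWN (1u << 31)

struct demo_desc {
    uint32_t buf_addr;      /* plain fields, written before the handover */
    uint32_t buf_len;
    _Atomic uint32_t ctrl;  /* carries the ownership bit */
};

/* CPU side: fill the fields, then barrier, then hand ownership to "HW". */
static void demo_desc_give(struct demo_desc *d, uint32_t addr, uint32_t len)
{
    d->buf_addr = addr;
    d->buf_len = len;
    atomic_thread_fence(memory_order_release);   /* stand-in for dma_wmb() */
    atomic_store_explicit(&d->ctrl, DEMO_OWN, memory_order_relaxed);
}

/* Simulated "HW" side: write back the length, then release ownership. */
static void demo_hw_complete(struct demo_desc *d, uint32_t received_len)
{
    d->buf_len = received_len;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&d->ctrl, 0u, memory_order_relaxed);
}

/* CPU side: check ownership first, then barrier, then read the other fields. */
static bool demo_desc_reclaim(struct demo_desc *d, uint32_t *len)
{
    if (atomic_load_explicit(&d->ctrl, memory_order_relaxed) & DEMO_OWN)
        return false;                            /* still owned by "HW" */
    atomic_thread_fence(memory_order_acquire);   /* stand-in for dma_rmb() */
    *len = d->buf_len;
    return true;
}
```

The point of the sketch is only the ordering: the ownership bit is the last thing written on handover and the first thing read on reclaim, with a barrier between it and the remaining fields in both directions.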