Re: AMD IOMMU stops RDMA NFS from working since kernel 5.5 (bisected)

> On Feb 11, 2020, at 10:32 AM, Robin Murphy <robin.murphy@xxxxxxx> wrote:
> 
> On 11/02/2020 3:24 pm, Chuck Lever wrote:
>>> On Feb 11, 2020, at 10:12 AM, Robin Murphy <robin.murphy@xxxxxxx> wrote:
>>> 
>>> On 11/02/2020 1:48 pm, Chuck Lever wrote:
>>>> Andre-
>>>> Thank you for the detailed report!
>>>> Tom-
>>>> There is a rich set of trace points available in the RPC/RDMA implementation in 5.4/5.5, fwiw.
>>>> Please keep me in the loop, let me know if there is anything I can do to help.
>>> 
>>> One aspect that may be worth checking is whether there's anywhere that assumes a successful return value from dma_map_sg() is always the same as the number of entries passed in - that's the most obvious way the iommu-dma code differs (legitimately) from the previous amd-iommu implementation.
>> net/sunrpc/xprtrdma/frwr_ops.c: frwr_map()
>>         mr->mr_nents =
>>                 ib_dma_map_sg(ia->ri_id->device, mr->mr_sg, i, mr->mr_dir);
>>         if (!mr->mr_nents)
>>                 goto out_dmamap_err;
>> Should that rather be "if (mr->mr_nents != i)" ?
> 
> No, that much is OK - the point is that dma_map_sg() may pack the DMA addresses such that sg_dma_len(sg) > sg->length - however, subsequently passing that mr->mr_nents to dma_unmap_sg() in frwr_mr_recycle() (rather than the original value of i) looks at a glance like an example of how things may start to get out-of-whack.

Robin, your explanation makes sense to me. I can post a fix for this imbalance later today for Andre to try.
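
For illustration only, and not the actual patch: a minimal userspace sketch of the imbalance described above. Everything in it is hypothetical (struct mock_sg, mock_map_sg(), mock_unmap_sg() and mock_count_mapped() are stand-ins, not kernel APIs). It assumes a mapper that merges adjacent segments, so the count returned by the map call can be smaller than the count passed in, and it shows why the unmap call needs the original count rather than the returned one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scatterlist entry (not struct scatterlist). */
struct mock_sg {
	unsigned long addr;      /* CPU-side address of the segment */
	unsigned int length;     /* CPU-side length of the segment */
	unsigned long dma_addr;  /* filled in by mock_map_sg() */
	unsigned int dma_len;    /* may exceed length after merging */
	int mapped;              /* 1 while this entry holds a mapping */
};

/*
 * Map nents entries, merging entries that are adjacent in address space,
 * as an IOMMU-backed implementation is allowed to do.
 * Returns the number of DMA segments produced, which may be < nents.
 */
static int mock_map_sg(struct mock_sg *sg, int nents)
{
	int i, out = 0;

	assert(sg != NULL && nents > 0);

	for (i = 0; i < nents; i++) {
		sg[i].mapped = 1;
		sg[i].dma_addr = 0;
		sg[i].dma_len = 0;
	}

	for (i = 0; i < nents; i++) {
		if (out > 0 &&
		    sg[out - 1].dma_addr + sg[out - 1].dma_len == sg[i].addr) {
			/* Adjacent to the previous DMA segment: extend it. */
			sg[out - 1].dma_len += sg[i].length;
		} else {
			/* Start a new DMA segment. */
			sg[out].dma_addr = sg[i].addr;
			sg[out].dma_len = sg[i].length;
			out++;
		}
	}
	return out;
}

/*
 * Unmap nents entries. The caller must pass the count originally given
 * to mock_map_sg(), not the value it returned.
 */
static void mock_unmap_sg(struct mock_sg *sg, int nents)
{
	int i;

	for (i = 0; i < nents; i++)
		sg[i].mapped = 0;
}

/* Count entries that still hold a mapping (leaked if non-zero after unmap). */
static int mock_count_mapped(const struct mock_sg *sg, int total)
{
	int i, n = 0;

	for (i = 0; i < total; i++)
		n += sg[i].mapped;
	return n;
}
```

With three 0x1000-byte entries at 0x1000, 0x2000 and 0x8000, mock_map_sg() returns 2 and the first DMA segment has dma_len 0x2000, larger than its length. Calling mock_unmap_sg() with that returned 2 leaves one entry mapped; calling it with the original 3 leaves none. That mirrors the rule for the real API, where dma_unmap_sg() takes the same nents that was passed to dma_map_sg().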


--
Chuck Lever






