Re: [PATCH v1] xprtrdma: Fix DMA scatter-gather list mapping imbalance

Hi Andre, thanks for trying this out.

> On Feb 11, 2020, at 3:50 PM, Andre Tomt <andre@xxxxxxxx> wrote:
> 
> On 11.02.2020 20:58, Chuck Lever wrote:
>> The @nents value that was passed to ib_dma_map_sg() has to be passed
>> to the matching ib_dma_unmap_sg() call. If ib_dma_map_sg() chooses to
>> concatenate sg entries, it will return a different nents value than
>> it was passed.
>> 
>> The bug was exposed by recent changes to the AMD IOMMU driver.
> 
> This seems to fail differently on my system; mount fails with:
> mount.nfs: mount system call failed
> 
> and the kernel log reports:
> [   38.890344] NFS: Registering the id_resolver key type
> [   38.890351] Key type id_resolver registered
> [   38.890352] Key type id_legacy registered
> [   38.901799] NFS: nfs4_discover_server_trunking unhandled error -5. Exiting with error EIO
> [   38.901817] NFS4: Couldn't follow remote path
> 
> amd_iommu=off still works
> 
> One detail I accidentally left out of the original report is that the server (Intel system) is running Ubuntu 20.04 ("beta") userspace, and the AMD clients are running Ubuntu 19.10 userspace. Although I don't believe this matters at this point.

Next thing to try:

# trace-cmd record -e sunrpc -e rpcrdma

then issue the mount command. Once it completes, ^C the trace-cmd and send me trace.dat.

Try this with both the v5.4 kernel and the v5.5 kernel (and note that trace-cmd overwrites trace.dat, so copy it out between tests).


>> Reported-by: Andre Tomt <andre@xxxxxxxx>
>> Suggested-by: Robin Murphy <robin.murphy@xxxxxxx>
>> Fixes: 1f541895dae9 ("xprtrdma: Don't defer MR recovery if ro_map fails")
>> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx>
>> ---
>>  net/sunrpc/xprtrdma/frwr_ops.c |    5 ++---
>>  1 file changed, 2 insertions(+), 3 deletions(-)
>> 
>> Hey Andre, please try this out. It just reverts the bit of brokenness that
>> Robin observed this morning. I've done basic testing here with Intel
>> IOMMU systems, no change in behavior (ie, all good to go).
>> 
>> diff --git a/net/sunrpc/xprtrdma/frwr_ops.c b/net/sunrpc/xprtrdma/frwr_ops.c
>> index 095be887753e..449bb51e4fe8 100644
>> --- a/net/sunrpc/xprtrdma/frwr_ops.c
>> +++ b/net/sunrpc/xprtrdma/frwr_ops.c
>> @@ -313,10 +313,9 @@ struct rpcrdma_mr_seg *frwr_map(struct rpcrdma_xprt *r_xprt,
>>  			break;
>>  	}
>>  	mr->mr_dir = rpcrdma_data_dir(writing);
>> +	mr->mr_nents = i;
>>  
>> -	mr->mr_nents =
>> -		ib_dma_map_sg(ia->ri_id->device, mr->mr_sg, i, mr->mr_dir);
>> -	if (!mr->mr_nents)
>> +	if (!ib_dma_map_sg(ia->ri_id->device, mr->mr_sg, i, mr->mr_dir))
>>  		goto out_dmamap_err;
>>  
>>  	ibmr = mr->frwr.fr_mr;
> 
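
To illustrate the rule stated in the patch description, here is a minimal userspace sketch. It is not kernel code: struct mock_sg, struct mock_mapping, mock_map_sg() and mock_unmap_sg() are hypothetical stand-ins for the scatterlist and for ib_dma_map_sg()/ib_dma_unmap_sg(), and the merging rule (coalesce entries whose addresses are contiguous) is an assumption made only for the demonstration. The point it tries to show is that the count returned by the map call can be smaller than the count passed in, so the unmap call has to be given the original count.

```c
#include <assert.h>

/* Hypothetical stand-in for one scatterlist entry (not struct scatterlist). */
struct mock_sg {
	unsigned long addr;
	unsigned long len;
};

/* Hypothetical record of a live mapping: remembers the count passed to map. */
struct mock_mapping {
	int orig_nents;
	int active;
};

/*
 * Model of a map call that may coalesce contiguous entries.
 * Returns the number of segments after merging, or 0 on failure.
 * The result can be smaller than nents.
 */
static int mock_map_sg(struct mock_mapping *m, const struct mock_sg *sg,
		       int nents)
{
	int i, mapped;

	assert(m && sg);
	if (nents <= 0)
		return 0;

	mapped = 1;
	for (i = 1; i < nents; i++)
		if (sg[i - 1].addr + sg[i - 1].len != sg[i].addr)
			mapped++;

	m->orig_nents = nents;
	m->active = 1;
	return mapped;
}

/*
 * Model of the matching unmap call. Returns 0 when nents equals the
 * count originally passed to mock_map_sg(), -1 on an imbalance.
 */
static int mock_unmap_sg(struct mock_mapping *m, int nents)
{
	if (!m->active || nents != m->orig_nents)
		return -1;
	m->active = 0;
	return 0;
}

/* Pattern removed by the patch: keep the return value, unmap with it. */
static int unmap_with_returned_count(const struct mock_sg *sg, int nents)
{
	struct mock_mapping m = { 0, 0 };
	int mr_nents = mock_map_sg(&m, sg, nents);

	if (!mr_nents)
		return -1;
	return mock_unmap_sg(&m, mr_nents);
}

/* Pattern restored by the patch: keep the original count, and use the
 * return value only to detect a mapping failure. */
static int unmap_with_original_count(const struct mock_sg *sg, int nents)
{
	struct mock_mapping m = { 0, 0 };
	int mr_nents = nents;

	if (!mock_map_sg(&m, sg, nents))
		return -1;
	return mock_unmap_sg(&m, mr_nents);
}
```

With three entries of which the first two are contiguous, mock_map_sg() reports two segments; unmap_with_returned_count() then hits the imbalance in this model, while unmap_with_original_count() does not.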

--
Chuck Lever
