On Thu, Feb 13, 2020 at 09:33:23AM -0500, Chuck Lever wrote: > > > > On Feb 12, 2020, at 2:30 PM, Jason Gunthorpe <jgg@xxxxxxxx> wrote: > > > > On Wed, Feb 12, 2020 at 02:09:03PM -0500, Chuck Lever wrote: > >> > >> > >>> On Feb 12, 2020, at 2:05 PM, Jason Gunthorpe <jgg@xxxxxxxx> wrote: > >>> > >>> On Wed, Feb 12, 2020 at 01:38:59PM -0500, Chuck Lever wrote: > >>>> > >>>>> On Feb 12, 2020, at 1:26 PM, Jason Gunthorpe <jgg@xxxxxxxx> wrote: > >>>>> > >>>>> On Wed, Feb 12, 2020 at 11:12:30AM -0500, Chuck Lever wrote: > >>>>>> The @nents value that was passed to ib_dma_map_sg() has to be passed > >>>>>> to the matching ib_dma_unmap_sg() call. If ib_dma_map_sg() choses to > >>>>>> concatenate sg entries, it will return a different nents value than > >>>>>> it was passed. > >>>>>> > >>>>>> The bug was exposed by recent changes to the AMD IOMMU driver, which > >>>>>> enabled sg entry concatenation. > >>>>>> > >>>>>> Looking all the way back to commit 4143f34e01e9 ("xprtrdma: Port to > >>>>>> new memory registration API") and reviewing other kernel ULPs, it's > >>>>>> not clear that the frwr_map() logic was ever correct for this case. > >>>>>> > >>>>>> Reported-by: Andre Tomt <andre@xxxxxxxx> > >>>>>> Suggested-by: Robin Murphy <robin.murphy@xxxxxxx> > >>>>>> Signed-off-by: Chuck Lever <chuck.lever@xxxxxxxxxx> > >>>>>> Cc: stable@xxxxxxxxxxxxxxx # v5.5 > >>>>>> net/sunrpc/xprtrdma/frwr_ops.c | 13 +++++++------ > >>>>>> 1 file changed, 7 insertions(+), 6 deletions(-) > >>>>> > >>>>> Yep > >>>>> > >>>>> Reviewed-by: Jason Gunthorpe <jgg@xxxxxxxxxxxx> > >>>> > >>>> Thanks. > >>>> > >>>> Wondering if it makes sense to add a Fixes tag for the AMD IOMMU commit > >>>> where NFS/RDMA stopped working, rather than the "Cc: stable # v5.5". > >>>> > >>>> Fixes: be62dbf554c5 ("iommu/amd: Convert AMD iommu driver to the dma-iommu api") > >>> > >>> Not really, this was broken for other configurations besides AMD > >> > >> Agreed, but the bug seems to have been inconsequential until now? > > > > I imagine it would get you on ARM or other archs, IIRC. > > That's certainly plausible, but I haven't received explicit bug reports > in this area. (I'm not at all saying that such bugs categorically do > not exist). Usually I encourage people to put the fixes line to the commit that is being fixed, pointing at some other commit that happens to expose the bug is not the best. > In any event, practical matters: the posted patch applies back to v5.4, > but fails to apply starting with v5.3. > > I think we can leave the "Cc: stable # v5.5"; and I'm open to requests > to backport this simple fix onto earlier stable kernels (back to v4.4), > which can be handled case-by-case. 'Salright? I'd just put Cc: stable, the stable folks will reject it on earlier versions because of conflicts and we can leave it. Jason