Re: [PATCH rdma-rc v1] RDMA/core: fix sg_to_page mapping for boundary condition

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Aug 19, 2022 at 03:13:47PM +0530, Kashyap Desai wrote:
> > > All addresses other than 0xfffffffffff00000 are stale entries.
> > > Once this kind of incorrect page entries are passed to the RDMA h/w,
> > > AMD IOMMU module detects the page fault whenever h/w tries to access
> > > addresses which are not actually populated by the ib stack correctly.
> > > Below prints are logged whenever this issue hits.
> >
> > I don't understand this. AFAIK on AMD platforms you can't create an IOVA
> > mapping at -1 like you are saying above, so how is
> > 0xfffffffffff00000 a valid DMA address?
> 
> Hi Jason -
> 
> Let me simplify - Consider a case if 1 SGE has 8K dma_len and starting dma
> address is <0xffffffffffffe000>.
> It is expected to have two page table entry - <0xffffffffffffe000 > and
> <0xfffffffffffff000 >.
> Both the DMA address not mapped to -1.   Device expose dma_mask_bits = 64,
> so above two addresses are valid mapping from IOMMU perspective.

That is not my point.

My point is that 0xFFFFFFFFFFFFFFF should never be used as a DMA
address because it invites overflow on any maths, and we are not
careful about this in the kernel in general.

> Since end_dma_addr will be zero (in current code) which is actually not
> end_dma_addr but potential next_dma_addr, we will only endup set_page() call
> one time.

Which is the math overflow.

> I think this is a valid mapping request and don't see any issue with IOMMU
> mapping is incorrect.

It should not create mappings that are so dangerous. There is really
no reason to use the last page of IOVA space that includes -1.

> > And if we have to tolerate these addreses then the code should be
> > designed to avoid the overflow in the first place ie
> > 'end_dma_addr' should be changed to 'last_dma_addr = dma_addr +
> > (dma_len - 1)' which does not overflow, and all the logics
> > carefully organized so none of the math overflows.
> 
> Making 'last_dma_addr = dma_addr + (dma_len - 1)' will have another side
> effect. 

Yes, the patch would have to fix everything about the logic to work
with a last and avoid overflowing maths

> How about just doing below ? This was my initial thought of fixing but I am
> not sure which approach Is best.
> 
> diff --git a/drivers/infiniband/core/verbs.c
> b/drivers/infiniband/core/verbs.c
> index e54b3f1b730e..56d1f3b20e98 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -2709,7 +2709,7 @@ int ib_sg_to_pages(struct ib_mr *mr, struct
> scatterlist *sgl, int sg_nents,
>                         prev_addr = page_addr;
>  next_page:
>                         page_addr += mr->page_size;
> -               } while (page_addr < end_dma_addr);
> +               } while (page_addr < (end_dma_addr - 1));

This is now overflowing twice :(

Does this bug even still exist? eg does this revert "fix" it?

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=af3e9579ecfb

It makes me wonder if the use of -1 is why drivers started failing
with this mode.

Jason



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux