RE: [PATCH rdma-rc v1] RDMA/core: fix sg_to_page mapping for boundary condition

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> > All addresses other than 0xfffffffffff00000 are stale entries.
> > Once this kind of incorrect page entries are passed to the RDMA h/w,
> > AMD IOMMU module detects the page fault whenever h/w tries to access
> > addresses which are not actually populated by the ib stack correctly.
> > Below prints are logged whenever this issue hits.
>
> I don't understand this. AFAIK on AMD platforms you can't create an IOVA
> mapping at -1 like you are saying above, so how is
> 0xfffffffffff00000 a valid DMA address?

Hi Jason -

Let me simplify - Consider a case if 1 SGE has 8K dma_len and starting dma
address is <0xffffffffffffe000>.
It is expected to have two page table entry - <0xffffffffffffe000 > and
<0xfffffffffffff000 >.
Both the DMA address not mapped to -1.   Device expose dma_mask_bits = 64,
so above two addresses are valid mapping from IOMMU perspective.

Since end_dma_addr will be zero (in current code) which is actually not
end_dma_addr but potential next_dma_addr, we will only endup set_page() call
one time.

I think this is a valid mapping request and don't see any issue with IOMMU
mapping is incorrect.

>
> Or, if the AMD IOMMU HW can actually do this, then I would say it is a bug
> in
> the IOMM DMA API to allow the aperture used for DMA mapping to get to the
> end of ULONG_MAX, it is just asking for overflow bugs.
>
> And if we have to tolerate these addreses then the code should be designed
> to
> avoid the overflow in the first place ie 'end_dma_addr'
> should be changed to 'last_dma_addr = dma_addr + (dma_len - 1)' which does
> not overflow, and all the logics carefully organized so none of the math
> overflows.

Making 'last_dma_addr = dma_addr + (dma_len - 1)' will have another side
effect. Driver may get call of set_page() more than max_pg_ptrs.
I have not debug how it can happen, but just wanted to share result with
you.  I can check if that is a preferred path.
'end_dma_addr' is used in code for other arithmetic.

How about just doing below ? This was my initial thought of fixing but I am
not sure which approach Is best.

diff --git a/drivers/infiniband/core/verbs.c
b/drivers/infiniband/core/verbs.c
index e54b3f1b730e..56d1f3b20e98 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -2709,7 +2709,7 @@ int ib_sg_to_pages(struct ib_mr *mr, struct
scatterlist *sgl, int sg_nents,
                        prev_addr = page_addr;
 next_page:
                        page_addr += mr->page_size;
-               } while (page_addr < end_dma_addr);
+               } while (page_addr < (end_dma_addr - 1));

                mr->length += dma_len;
                last_end_dma_addr = end_dma_addr;

>
> Jason

Attachment: smime.p7s
Description: S/MIME Cryptographic Signature


[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux