On Sun, Feb 09, 2025 at 02:26:08PM +0000, Michael Margolin wrote:
> A single scatter-gather entry is limited by a 32 bits "length" field
> that is practically 4GB - PAGE_SIZE. This means that even when the
> memory is physically contiguous, we might need more than one entry to
> represent it. Additionally when using dmabuf, the sg_table might be
> originated outside the subsystem and optimized for other needs.
>
> For instance an SGT of 16GB GPU continuous memory might look like this:
> (a real life example)
>
> dma_address 34401400000, length fffff000
> dma_address 345013ff000, length fffff000
> dma_address 346013fe000, length fffff000
> dma_address 347013fd000, length fffff000
> dma_address 348013fc000, length 4000
>
> Since ib_umem_find_best_pgsz works within SG entries, in the above case
> we will result with the worst possible 4KB page size.
>
> Fix this by taking into consideration only the alignment of addresses of
> real discontinuity points rather than treating SG entries as such, and
> adjust the page iterator to correctly handle cross SG entry pages.
>
> There is currently an assumption that drivers do not ask for pages
> bigger than maximal DMA size supported by their devices.
>
> Reviewed-by: Firas Jahjah <firasj@xxxxxxxxxx>
> Reviewed-by: Yonatan Nachum <ynachum@xxxxxxxxxx>
> Signed-off-by: Michael Margolin <mrgolin@xxxxxxxxxx>
> ---
>  drivers/infiniband/core/umem.c  | 34 +++++++++++++++++++++++----------
>  drivers/infiniband/core/verbs.c | 11 ++++++-----
>  2 files changed, 30 insertions(+), 15 deletions(-)

Applied with the following change to prevent arithmetic overflow.

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index e7e428369159..63a92d6cfbc2 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -112,8 +112,7 @@ unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
 		/* If the current entry is physically contiguous with the previous
 		 * one, no need to take its start addresses into consideration.
 		 */
-		if (curr_base + curr_len != sg_dma_address(sg)) {
-
+		if (curr_base != sg_dma_address(sg) - curr_len) {
 			curr_base = sg_dma_address(sg);
 			curr_len = 0;
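
For readers who want to see the contiguity check in isolation, here is a
minimal userspace sketch (not the kernel code) that walks the five entries
from the commit message and counts real discontinuity points, using the same
subtraction form of the comparison as the hunk above. struct sg_entry,
example_sgt and count_contiguous_runs are hypothetical names made up for this
illustration; the kernel works on struct scatterlist via sg_dma_address()
and sg_dma_len().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped SG entry (not struct scatterlist). */
struct sg_entry {
	uint64_t dma_address;
	uint32_t length;	/* 32-bit length, as described in the commit message */
};

/* The five entries quoted in the commit message (values are hexadecimal). */
static const struct sg_entry example_sgt[] = {
	{ 0x34401400000ULL, 0xfffff000u },
	{ 0x345013ff000ULL, 0xfffff000u },
	{ 0x346013fe000ULL, 0xfffff000u },
	{ 0x347013fd000ULL, 0xfffff000u },
	{ 0x348013fc000ULL, 0x00004000u },
};

/*
 * Count runs of physically contiguous entries and sum their lengths.
 * A new run starts only where an entry does not begin exactly where the
 * current run ends, i.e. at a real discontinuity point.
 */
static size_t count_contiguous_runs(const struct sg_entry *sg, size_t n,
				    uint64_t *total_len)
{
	uint64_t curr_base = 0, curr_len = 0, total = 0;
	size_t runs = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Same shape as the applied check: compare the run base with
		 * (entry address - run length) instead of adding to the base.
		 */
		if (runs == 0 || curr_base != sg[i].dma_address - curr_len) {
			curr_base = sg[i].dma_address;
			curr_len = 0;
			runs++;
		}
		curr_len += sg[i].length;
		total += sg[i].length;
	}

	if (total_len)
		*total_len = total;
	return runs;
}
```

Calling count_contiguous_runs(example_sgt, 5, &total) on the quoted SGT
reports a single run covering 0x400000000 bytes (16GB), so only the start
address of the first entry constrains the page size.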