Re: [PATCH rdma-next 3/6] RDMA/umem: Add API to return optimal HW DMA addresses from SG list

On Mon, Dec 24, 2018 at 04:32:24PM -0600, Shiraz Saleem wrote:
> This helper iterates the SG list and returns suitable
> HW-aligned DMA addresses within a driver-supported
> page size. The implementation is intended to work for
> HW that supports single or mixed page sizes in an MR.
> This avoids the need for driver-specific algorithms to
> achieve the same thing, and avoids redundant walks of
> the SG list.
> 
> Suggested-by: Jason Gunthorpe <jgg@xxxxxxxx>
> Reviewed-by: Michael J. Ruhl <michael.j.ruhl@xxxxxxxxx>
> Signed-off-by: Shiraz Saleem <shiraz.saleem@xxxxxxxxx>
>  drivers/infiniband/core/umem.c | 78 ++++++++++++++++++++++++++++++++++++++++++
>  include/rdma/ib_umem.h         | 23 +++++++++++++
>  2 files changed, 101 insertions(+)
> 
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index b2f2d75..8f8a1f6 100644
> +++ b/drivers/infiniband/core/umem.c
> @@ -204,6 +204,84 @@ unsigned long ib_umem_find_single_pg_size(struct ib_umem *umem,
>  }
>  EXPORT_SYMBOL(ib_umem_find_single_pg_size);
>  
> +static unsigned int ib_umem_find_mixed_pg_bit(struct scatterlist *sgl_head,
> +					      struct sg_phys_iter *sg_phys_iter)
> +{
> +	unsigned long dma_addr_start, dma_addr_end;
> +
> +	dma_addr_start = sg_dma_address(sg_phys_iter->sg);
> +	dma_addr_end = sg_dma_address(sg_phys_iter->sg) +
> +		       sg_dma_len(sg_phys_iter->sg);
> +
> +	if (sg_phys_iter->sg == sgl_head)
> +		return ib_umem_find_pg_bit(dma_addr_end,
> +					   sg_phys_iter->supported_pgsz);
> +	else if (sg_is_last(sg_phys_iter->sg))
> +		return ib_umem_find_pg_bit(dma_addr_start,
> +					   sg_phys_iter->supported_pgsz);
> +	else
> +		return ib_umem_find_pg_bit(sg_phys_iter->phyaddr,
> +					   sg_phys_iter->supported_pgsz);
> +}
> +void ib_umem_start_phys_iter(struct ib_umem *umem,
> +			     struct sg_phys_iter *sg_phys_iter,
> +			     unsigned long supported_pgsz)
> +{
> +	memset(sg_phys_iter, 0, sizeof(struct sg_phys_iter));
> +	sg_phys_iter->sg = umem->sg_head.sgl;
> +	sg_phys_iter->supported_pgsz = supported_pgsz;
> +
> +	/* Single page support in MR */
> +	if (hweight_long(supported_pgsz) == 1)
> +		sg_phys_iter->pg_bit = fls64(supported_pgsz) - 1;
> +	else
> +		sg_phys_iter->mixed_pg_support = true;
> +}
> +EXPORT_SYMBOL(ib_umem_start_phys_iter);
> +
> +/**
> + * ib_umem_next_phys_iter - SG list iterator that returns aligned HW address
> + * @umem: umem struct
> + * @sg_phys_iter: SG HW address iterator
> + * @supported_pgsz: bitmask of HW supported page sizes
> + *
> + * This helper iterates over the SG list and returns the HW
> + * address aligned to a supported HW page size.

I think more documentation is appropriate here:

Each true result returns a contiguous aligned block of memory such
that:

- pg_bit indicates the alignment of this block, such that
    phyaddr & ((1 << pg_bit) - 1) == 0

- All blocks except the starting block have a zero offset.
  For the starting block, offset indicates the first valid byte in the
  MR; HW should not permit access to bytes earlier than that offset.

  [And how is this being done in i40iw? A side effect of the user
   virtual base address?]

- For all blocks except the last, len + offset equals 1 << pg_bit

False is returned when iteration is complete and all blocks have been
seen.

> + */
> +bool ib_umem_next_phys_iter(struct ib_umem *umem,
> +			    struct sg_phys_iter *sg_phys_iter)
> +{
> +	unsigned long pg_mask;
> +
> +	if (!sg_phys_iter->sg || !sg_phys_iter->supported_pgsz)
> +		return false;
> +
> +	if (sg_phys_iter->remaining) {
> +		sg_phys_iter->phyaddr += (sg_phys_iter->len + sg_phys_iter->offset);
> +	} else {
> +		sg_phys_iter->phyaddr = sg_dma_address(sg_phys_iter->sg);
> +		sg_phys_iter->remaining = sg_dma_len(sg_phys_iter->sg);
> +	}
> +
> +	/* Mixed page support in MR */
> +	if (sg_phys_iter->mixed_pg_support)
> +		sg_phys_iter->pg_bit = ib_umem_find_mixed_pg_bit(umem->sg_head.sgl,
> +								 sg_phys_iter);
> +
> +	pg_mask = ~(BIT_ULL(sg_phys_iter->pg_bit) - 1);
> +
> +	sg_phys_iter->offset = sg_phys_iter->phyaddr & ~pg_mask;
> +	sg_phys_iter->phyaddr = sg_phys_iter->phyaddr & pg_mask;
> +	sg_phys_iter->len = min_t(unsigned long, sg_phys_iter->remaining,
> +			BIT_ULL(sg_phys_iter->pg_bit) - sg_phys_iter->offset);

Maybe drop len and offset? Nothing uses them in this series, right?

Seems fine otherwise

Jason


