Re: [PATCH WIP 28/43] IB/core: Introduce new fast registration API

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 23, 2015 at 01:15:16PM +0300, Sagi Grimberg wrote:
> >I was hoping we'd move the DMA flush and translate into here and make
> >it mandatory. Is there any reason not to do that?
> 
> The reason I didn't added it in was so the ULPs can make sure they meet
> the restrictions of ib_map_mr_sg(). Allow SRP to iterate on his
> SG list set partials and iSER to detect gaps (they need to dma map
> for that).

The ULP can always get the sg list's virtual address to check for
gaps. Page aligned gaps are always OK.

BTW, the logic in ib_sg_to_pages should be checking that directly, as
coded, it won't work with swiotlb:

// Only the first SG entry can start unaligned
if (i && page_addr != dma_addr)
    return EINVAL;
// Only the last SG entry can end unaligned
if ((page_addr + dma_len) & PAGE_MASK != end_dma_addr)
 if (!is_last)
     return EINVAL;

Don't use sg->offset after dma mapping.

The biggest problem with checking the virtual address is
swiotlb. However, if swiotlb is used this API is basically broken as
swiotlb downgrades everything to a 2k alignment, which means we only
ever get 1 s/g entry.

To efficiently support swiotlb we'd need to see the driver be able to
work with a page size of IO_TLB_SEGSIZE (2k) so it can handle the
de-alignment that happens during bouncing.

My biggest problem with pushing the dma address up to the ULP is
basically that: The ULP has no idea what the driver can handle, maybe
the driver can handle the 2k pages.

So, that leaves a flow where the ULP does a basic sanity check on the
virtual side, then asks the IB core to map it. The mapping could still
fail because of swiotlb.

If the mapping fails, then the ULP has to bounce buffer, or MR split,
or totally fail.

For bounce buffer, all solutions have a DMA map/unmap cost, so it
doesn't matter if ib_map_mr_sg does that internally.

For MR fragment, the DMA mapping is still usable. Maybe we do need a
slightly different core API to help MR fragmentation? Sounds like NFS
uses this too?

 num_mrs = ib_mr_fragment_sg(&scatterlist);
 while (..)
   ib_map_fragment_sg(mr[i++],&scatterlist,&offset);

Perhaps?

Maybe that is even better because something like iser could do
the parallel:
 ib_mr_needs_fragment_sg(reference_mr,&scatterlist)

Which hides all the various restrictions in driver code.

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux