On Thu, Mar 12, 2020 at 04:39:02PM +0100, Christian König wrote:

> > > The structure for holding dma addresses doesn't really exist
> > > in a generic form, but would be an array of these structures:
> > >
> > > struct dma_sg {
> > > 	dma_addr_t addr;
> > > 	u32 len;
> > > };
> > Same question, RDMA needs to represent gigabytes of pages in a DMA
> > list, we will need some generic way to handle that. I suspect GPU has
> > a similar need? Can it be accommodated in some generic dma_sg?
>
> Yes, we easily have ranges of >1GB. So I would certainly say u64 for the len
> here.

To be clear, I mean specifically 1GB of dma map composed of 262k pages,
mapped into 262k dma_sg's that take around 4M of memory to represent
as struct dma_sg. I'd really prefer some scheme that doesn't rely on
vmalloc..

Some approach to have a single dma_sg > 4G seems less commonly needed?
I don't think any RDMA HW today can handle a single SGL that large, at
least.

> > - Add some generic dma_sg data structure and helper
> > - Add dma mapping code to go from pages to dma_sg
> > - Rework RDMA to use dma_sg and the new dma mapping code
> > - Rework dmabuf to support dma mapping to a dma_sg
> > - Rework GPU drivers to use dma_sg
> > - Teach p2pdma to generate a dma_sg from a BAR page list
> > - This series
> >
> > ?
>
> Sounds pretty much like a plan to me, but unfortunately like a rather huge
> one.

I know parts of this have been advancing.. CH has been working on
fixing up the DMA layer enough to do #1 and #2, I think.

> Because of this and cause I don't know if all drivers can live with dma_sg
> I'm not sure if we shouldn't have the switch from scatterlist to dma_sg
> separately to this peer2peer work.

So far any attempts to make sgls without struct page have failed for
various reasons. Too often obscure stuff does actually want the struct
page. Stuffing BAR memory pages into the SGL is bad enough already.
:(

One pragmatic path might be to define this new 'dma_sg' in a way where
it would have the same memory layout as a 'struct scatterlist'.

Something like

 struct dma_scatterlist {
	unsigned long link;
	unsigned int reserved1;
 #ifndef CONFIG_NEED_SG_DMA_LENGTH
	unsigned int dma_length;
 #else
	unsigned int reserved2;
 #endif
	dma_addr_t dma_address;
 #ifdef CONFIG_NEED_SG_DMA_LENGTH
	unsigned int dma_length;
 #endif
 };

 struct dma_sg_table {
	union {
		struct dma_scatterlist *dma_sgl;
		struct future_more_efficient_structure *future;
	};
	unsigned int nents;
 };

Then a dma_map_sg could be

 struct dma_sg_table *dma_map_sg_attrs_to_dma(
	struct device *dev, struct scatterlist *sg, int nents,
	enum dma_data_direction dir, unsigned long attrs)
 {
	struct dma_sg_table *res;

	res = kmalloc(sizeof(*res), GFP_KERNEL);
	if (!res)
		return NULL;
	/* dma_map_sg_attrs() returns 0 if the mapping failed */
	res->nents = dma_map_sg_attrs(dev, sg, nents, dir, attrs);
	if (!res->nents) {
		kfree(res);
		return NULL;
	}
	/* same layout, so the mapped sgl can be viewed as a dma_scatterlist */
	res->dma_sgl = (struct dma_scatterlist *)sg;
	return res;
 }

Then at least the work can get split up: I can switch RDMA drivers to
use dma_sg_table, then I can switch the subsystem to call
dma_map_sg_attrs_to_dma, then when we get dma_map_biovec_attrs() I can
work on switching the input sgl to a biovec without changing the
drivers.

After enough conversions are done we can optimize the memory layout
inside dma_sg_table. After everything is done we can drop support for
'dma_scatterlist'.

It doesn't feel objectionable to build a 'dma_sg_table' without a
struct page.

Jason
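
The layout-compatibility idea above can be checked mechanically. What
follows is only a userspace sketch under stated assumptions, not kernel
code: 'struct mock_scatterlist' is a hypothetical stand-in that mirrors
the field order of 'struct scatterlist' as I understand it from kernels
of this period, dma_addr_t is assumed to be 64 bits, and
CONFIG_NEED_SG_DMA_LENGTH is assumed to be set. dma_view() and
roundtrip_ok() are names made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumptions for this sketch: 64-bit DMA addresses, option enabled. */
typedef uint64_t dma_addr_t;
#define CONFIG_NEED_SG_DMA_LENGTH 1

/* Hypothetical mock of struct scatterlist (not the kernel header). */
struct mock_scatterlist {
	unsigned long page_link;
	unsigned int offset;
	unsigned int length;
	dma_addr_t dma_address;
#ifdef CONFIG_NEED_SG_DMA_LENGTH
	unsigned int dma_length;
#endif
};

/* The page-less view proposed in the mail. */
struct dma_scatterlist {
	unsigned long link;
	unsigned int reserved1;
#ifndef CONFIG_NEED_SG_DMA_LENGTH
	unsigned int dma_length;
#else
	unsigned int reserved2;
#endif
	dma_addr_t dma_address;
#ifdef CONFIG_NEED_SG_DMA_LENGTH
	unsigned int dma_length;
#endif
};

/* Compile-time checks that the two layouts line up. */
static_assert(sizeof(struct dma_scatterlist) ==
	      sizeof(struct mock_scatterlist), "size differs");
static_assert(offsetof(struct dma_scatterlist, link) ==
	      offsetof(struct mock_scatterlist, page_link), "link offset");
static_assert(offsetof(struct dma_scatterlist, dma_address) ==
	      offsetof(struct mock_scatterlist, dma_address), "address offset");
#ifdef CONFIG_NEED_SG_DMA_LENGTH
static_assert(offsetof(struct dma_scatterlist, dma_length) ==
	      offsetof(struct mock_scatterlist, dma_length), "length offset");
#else
static_assert(offsetof(struct dma_scatterlist, dma_length) ==
	      offsetof(struct mock_scatterlist, length), "length offset");
#endif

/* Copy the bytes of a mapped entry into the page-less view. */
static struct dma_scatterlist dma_view(const struct mock_scatterlist *sg)
{
	struct dma_scatterlist out;

	memcpy(&out, sg, sizeof(out));
	return out;
}

/* Returns 1 if address and length survive the change of view. */
static int roundtrip_ok(dma_addr_t addr, unsigned int len)
{
	struct mock_scatterlist sg = {0};
	struct dma_scatterlist view;

	sg.dma_address = addr;
#ifdef CONFIG_NEED_SG_DMA_LENGTH
	sg.dma_length = len;
#else
	sg.length = len;
#endif
	view = dma_view(&sg);
	return view.dma_address == addr && view.dma_length == len;
}
```

In the kernel the same checks would more likely be BUILD_BUG_ON()s next
to the structure definition, so that a change to 'struct scatterlist'
breaks the build rather than the drivers using the page-less view.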