On Wed, Oct 26, 2022 at 02:59:04PM +0200, Greg Kroah-Hartman wrote: > On Wed, Oct 26, 2022 at 10:48:58AM +0100, Catalin Marinas wrote: > > On Wed, Oct 26, 2022 at 08:50:07AM +0200, Greg Kroah-Hartman wrote: > > > On Tue, Oct 25, 2022 at 09:52:47PM +0100, Catalin Marinas wrote: > > > > --- a/lib/kasprintf.c > > > > +++ b/lib/kasprintf.c > > > > @@ -22,7 +22,7 @@ char *kvasprintf(gfp_t gfp, const char *fmt, va_list ap) > > > > first = vsnprintf(NULL, 0, fmt, aq); > > > > va_end(aq); > > > > > > > > - p = kmalloc_track_caller(first+1, gfp); > > > > + p = kmalloc_track_caller(first+1, gfp | __GFP_PACKED); > > > > > > How do we know this is going to be small? > > > > We don't need to know it's small. If it's over 96 bytes on arm64, it > > goes in the kmalloc-128 cache or higher. It can even use the kmalloc-192 > > cache that's not aligned to ARCH_DMA_MINALIGN (128). That's why I'd > > avoid GFP_TINY as this flag is not about size but rather alignment (e.g. > > 192 may not be DMA safe but it's larger than 128). > > > > That said, I should try to identify sizes > 128 and <= 192 and pass such > > flag. > > What if the flag is used for large sizes, what will happen? In other > words, why would you ever NOT want to use this? DMA is a big issue, but > then we should flip that around and explicitly mark the times we want > DMA, not not-want DMA as "not want" is by far the most common, right? Indeed, flipping these flags is the ideal solution. It's just tracking them down and I'm not sure coccinelle on its own can handle it (maybe it does). As an example of what needs changing: ----------------------8<------------------------- diff --git a/drivers/infiniband/hw/irdma/osdep.h b/drivers/infiniband/hw/irdma/osdep.h index fc1ba2a3e6fb..8ba94d563db3 100644 --- a/drivers/infiniband/hw/irdma/osdep.h +++ b/drivers/infiniband/hw/irdma/osdep.h @@ -16,7 +16,7 @@ struct irdma_dma_info { }; struct irdma_dma_mem { - void *va; + void __dma *va; dma_addr_t pa; u32 size; } __packed; diff --git a/drivers/infiniband/hw/irdma/puda.c b/drivers/infiniband/hw/irdma/puda.c index 4ec9639f1bdb..ab15c5e812d0 100644 --- a/drivers/infiniband/hw/irdma/puda.c +++ b/drivers/infiniband/hw/irdma/puda.c @@ -143,13 +143,13 @@ static struct irdma_puda_buf *irdma_puda_alloc_buf(struct irdma_sc_dev *dev, struct irdma_virt_mem buf_mem; buf_mem.size = sizeof(struct irdma_puda_buf); - buf_mem.va = kzalloc(buf_mem.size, GFP_KERNEL); + buf_mem.va = dma_kzalloc(buf_mem.size, GFP_KERNEL); if (!buf_mem.va) return NULL; buf = buf_mem.va; buf->mem.size = len; - buf->mem.va = kzalloc(buf->mem.size, GFP_KERNEL); + buf->mem.va = dma_kzalloc(buf->mem.size, GFP_KERNEL); if (!buf->mem.va) goto free_virt; buf->mem.pa = dma_map_single(dev->hw->device, buf->mem.va, diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 0ee20b764000..8476e6609f35 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -324,7 +324,7 @@ static inline void dma_free_noncoherent(struct device *dev, size_t size, dma_free_pages(dev, size, virt_to_page(vaddr), dma_handle, dir); } -static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr, +static inline dma_addr_t dma_map_single_attrs(struct device *dev, void __dma *ptr, size_t size, enum dma_data_direction dir, unsigned long attrs) { /* DMA must never operate on areas that might be remapped. */ ----------------------8<------------------------- Basically any pointer type passed to dma_map_single() would need the __dma attribute. Once that's done, the next step is changing the allocator from kmalloc() to a new dma_kmalloc(). There are other places where the pointer gets assigned the value of another pointer (e.g. skb->data), so the origin pointer need to inherit the __dma attribute (and its original allocator changed). The scatterlist API may need changing slightly as it works on pages + offsets. > In other words, I'm leaning hard on the "mark allocations that need DMA > memory as needing DMA memory" option. Yes it will be more work > initially, but I bet a lot if not most, of them can be caught with > coccinelle scripts. And then enforced with sparse to ensure they don't > break in the future. I'll read a bit more on this. > That would provide a huge hint to the allocator as to what is needed, > while this "packed" really doesn't express the intent here at all. Then > if your allocator/arch has explicit requirements for DMA memory, it can > enforce them and not just hope and guess that people mark things as "not > dma". "Packed" is more about saying "I don't need the ARCH_KMALLOC_MINALIGN alignment". That's why I went for it instead of GFP_NODMA (which usually leads to your question of why not doing it the other way around). -- Catalin