On Tue, Sep 29, 2020 at 9:46 PM Chris Goldsworthy <cgoldswo@xxxxxxxxxxxxxx> wrote: > > On 2020-09-25 21:24, John Stultz wrote: > > Reuse/abuse the pagepool code from the network code to speed > > up allocation performance. > > > > This is similar to the ION pagepool usage, but tries to > > utilize generic code instead of a custom implementation. > > > > Cc: Sumit Semwal <sumit.semwal@xxxxxxxxxx> > > Cc: Liam Mark <lmark@xxxxxxxxxxxxxx> > > Cc: Laura Abbott <labbott@xxxxxxxxxx> > > Cc: Brian Starkey <Brian.Starkey@xxxxxxx> > > Cc: Hridya Valsaraju <hridya@xxxxxxxxxx> > > Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx> > > Cc: Sandeep Patil <sspatil@xxxxxxxxxx> > > Cc: Ørjan Eide <orjan.eide@xxxxxxx> > > Cc: Robin Murphy <robin.murphy@xxxxxxx> > > Cc: Ezequiel Garcia <ezequiel@xxxxxxxxxxxxx> > > Cc: Simon Ser <contact@xxxxxxxxxxx> > > Cc: James Jones <jajones@xxxxxxxxxx> > > Cc: linux-media@xxxxxxxxxxxxxxx > > Cc: dri-devel@xxxxxxxxxxxxxxxxxxxxx > > Signed-off-by: John Stultz <john.stultz@xxxxxxxxxx> > > --- > > drivers/dma-buf/heaps/Kconfig | 1 + > > drivers/dma-buf/heaps/system_heap.c | 32 +++++++++++++++++++++++++---- > > 2 files changed, 29 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/dma-buf/heaps/Kconfig > > b/drivers/dma-buf/heaps/Kconfig > > index a5eef06c4226..f13cde4321b1 100644 > > --- a/drivers/dma-buf/heaps/Kconfig > > +++ b/drivers/dma-buf/heaps/Kconfig > > @@ -1,6 +1,7 @@ > > config DMABUF_HEAPS_SYSTEM > > bool "DMA-BUF System Heap" > > depends on DMABUF_HEAPS > > + select PAGE_POOL > > help > > Choose this option to enable the system dmabuf heap. The system > > heap > > is backed by pages from the buddy allocator. If in doubt, say Y. > > diff --git a/drivers/dma-buf/heaps/system_heap.c > > b/drivers/dma-buf/heaps/system_heap.c > > index 882a632e9bb7..9f57b4c8ae69 100644 > > --- a/drivers/dma-buf/heaps/system_heap.c > > +++ b/drivers/dma-buf/heaps/system_heap.c > > @@ -20,6 +20,7 @@ > > #include <linux/scatterlist.h> > > #include <linux/slab.h> > > #include <linux/vmalloc.h> > > +#include <net/page_pool.h> > > > > struct dma_heap *sys_heap; > > > > @@ -46,6 +47,7 @@ struct dma_heap_attachment { > > static gfp_t order_flags[] = {HIGH_ORDER_GFP, LOW_ORDER_GFP, > > LOW_ORDER_GFP}; > > static const unsigned int orders[] = {8, 4, 0}; > > #define NUM_ORDERS ARRAY_SIZE(orders) > > +struct page_pool *pools[NUM_ORDERS]; > > > > static struct sg_table *dup_sg_table(struct sg_table *table) > > { > > @@ -264,13 +266,17 @@ static void system_heap_dma_buf_release(struct > > dma_buf *dmabuf) > > struct system_heap_buffer *buffer = dmabuf->priv; > > struct sg_table *table; > > struct scatterlist *sg; > > - int i; > > + int i, j; > > > > table = &buffer->sg_table; > > for_each_sg(table->sgl, sg, table->nents, i) { > > struct page *page = sg_page(sg); > > > > - __free_pages(page, compound_order(page)); > > + for (j = 0; j < NUM_ORDERS; j++) { > > + if (compound_order(page) == orders[j]) > > + break; > > + } > > + page_pool_put_full_page(pools[j], page, false); > > } > > sg_free_table(table); > > kfree(buffer); > > @@ -300,8 +306,7 @@ static struct page > > *alloc_largest_available(unsigned long size, > > continue; > > if (max_order < orders[i]) > > continue; > > - > > - page = alloc_pages(order_flags[i], orders[i]); > > + page = page_pool_alloc_pages(pools[i], order_flags[i]); > > if (!page) > > continue; > > return page; > > @@ -406,6 +411,25 @@ static const struct dma_heap_ops system_heap_ops = > > { > > static int system_heap_create(void) > > { > > struct dma_heap_export_info exp_info; > > + int i; > > + > > + for (i = 0; i < NUM_ORDERS; i++) { > > + struct page_pool_params pp; > > + > > + memset(&pp, 0, sizeof(pp)); > > + pp.order = orders[i]; > > + pp.dma_dir = DMA_BIDIRECTIONAL; > > + pools[i] = page_pool_create(&pp); > > + > > + if (IS_ERR(pools[i])) { > > + int j; > > + > > + pr_err("%s: page pool creation failed!\n", __func__); > > + for (j = 0; j < i; j++) > > + page_pool_destroy(pools[j]); > > + return PTR_ERR(pools[i]); > > + } > > + } > > > > exp_info.name = "system"; > > exp_info.ops = &system_heap_ops; > > This is cool, I didn't know about this pooling code under /net/core. > Nice and compact. Oh, bummer. I just realized when allocating w/ __GFP_ZERO from the page-pool, the logic doesn't actually clear pages when pulling from the cache. So unfortunately this is what accounts for much of the performance benefit I was seeing with this approach, so I'll have to retract my claim on the performance gain with this. :( I've got a first pass at zeroing the pages we put into the pool, but the numbers are not so great just yet so I've got some further work to do. thanks -john