On Wed, 2014-02-19 at 13:06 -0800, Ben Widawsky wrote: > On Wed, Feb 19, 2014 at 09:11:46PM +0200, Imre Deak wrote: > > On Wed, 2014-02-12 at 14:28 -0800, Ben Widawsky wrote: > > > The previous allocation mechanism would get 2 contiguous allocations, > > > one for the page directories, and one for the page tables. As each page > > > table is 1 page, and there are 512 of these per page directory, this > > > goes to 1MB. An unfriendly request at best. Worse still, our HW now > > ---^ > > Fwiw, 2MB. > > Thanks. > > > > > > supports 4 page directories, and a 2MB allocation is not allowed. > > > > > > In order to fix this, this patch attempts to split up each page table > > > allocation into a single, discrete allocation. There is nothing really > > > fancy about the patch itself, it just has to manage an extra pointer > > > indirection, and have a fancier bit of logic to free up the pages. > > > > > > To accommodate some of the added complexity, two new helpers are > > > introduced to allocate, and free the page table pages. > > > > > > NOTE: I really wanted to split the way we do allocations, and the way in > > > which we identify the page table/page directory being used. I found > > > splitting this functionality up to be too unwieldy. I apologize in > > > advance to the reviewer. I'd recommend looking at the result, rather > > > than the diff. > > > > > > v2/NOTE2: This patch predated commit: > > > 6f1cc993518462ccf039e195fabd47e7aa5bfd13 > > > Author: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx> > > > Date: Tue Dec 31 15:50:31 2013 +0000 > > > > > > drm/i915: Avoid dereference past end of page arr > > > > > > It fixed the same issue as that patch, but because of the limbo state of > > > PPGTT, Chris patch was merged instead. The excess churn is a result of > > > my using my original patch, which has my preferred naming. Primarily > > > act_* is changed to which_*, but it's mostly the same otherwise. I've > > > kept the convention Chris used for the pte wrap (I had something > > > slightly different, and broken - but fixable) > > > > > > Signed-off-by: Ben Widawsky <ben@xxxxxxxxxxxx> > > > --- > > > drivers/gpu/drm/i915/i915_drv.h | 5 +- > > > drivers/gpu/drm/i915/i915_gem_gtt.c | 127 ++++++++++++++++++++++++++++-------- > > > 2 files changed, 103 insertions(+), 29 deletions(-) > > > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > > > index 2ebad96..d9a6327 100644 > > > --- a/drivers/gpu/drm/i915/i915_drv.h > > > +++ b/drivers/gpu/drm/i915/i915_drv.h > > > @@ -691,6 +691,7 @@ struct i915_gtt { > > > }; > > > #define gtt_total_entries(gtt) ((gtt).base.total >> PAGE_SHIFT) > > > > > > +#define GEN8_LEGACY_PDPS 4 > > > struct i915_hw_ppgtt { > > > struct i915_address_space base; > > > struct kref ref; > > > @@ -698,14 +699,14 @@ struct i915_hw_ppgtt { > > > unsigned num_pd_entries; > > > union { > > > struct page **pt_pages; > > > - struct page *gen8_pt_pages; > > > + struct page **gen8_pt_pages[GEN8_LEGACY_PDPS]; > > > }; > > > struct page *pd_pages; > > > int num_pd_pages; > > > int num_pt_pages; > > > union { > > > uint32_t pd_offset; > > > - dma_addr_t pd_dma_addr[4]; > > > + dma_addr_t pd_dma_addr[GEN8_LEGACY_PDPS]; > > > }; > > > union { > > > dma_addr_t *pt_dma_addr; > > > diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > index 5bfc6ff..5299acc 100644 > > > --- a/drivers/gpu/drm/i915/i915_gem_gtt.c > > > +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c > > > @@ -64,7 +64,19 @@ typedef gen8_gtt_pte_t gen8_ppgtt_pde_t; > > > > > > #define GEN8_PTES_PER_PAGE (PAGE_SIZE / sizeof(gen8_gtt_pte_t)) > > > #define GEN8_PDES_PER_PAGE (PAGE_SIZE / sizeof(gen8_ppgtt_pde_t)) > > > -#define GEN8_LEGACY_PDPS 4 > > > + > > > +/* GEN8 legacy style addressis defined as a 3 level page table: > > > + * 31:30 | 29:21 | 20:12 | 11:0 > > > + * PDPE | PDE | PTE | offset > > > + * The difference as compared to normal x86 3 level page table is the PDPEs are > > > + * programmed via register. > > > + */ > > > +#define GEN8_PDPE_SHIFT 30 > > > +#define GEN8_PDPE_MASK 0x3 > > > +#define GEN8_PDE_SHIFT 21 > > > +#define GEN8_PDE_MASK 0x1ff > > > +#define GEN8_PTE_SHIFT 12 > > > +#define GEN8_PTE_MASK 0x1ff > > > > > > #define PPAT_UNCACHED_INDEX (_PAGE_PWT | _PAGE_PCD) > > > #define PPAT_CACHED_PDE_INDEX 0 /* WB LLC */ > > > @@ -261,32 +273,36 @@ static void gen8_ppgtt_clear_range(struct i915_address_space *vm, > > > struct i915_hw_ppgtt *ppgtt = > > > container_of(vm, struct i915_hw_ppgtt, base); > > > gen8_gtt_pte_t *pt_vaddr, scratch_pte; > > > - unsigned first_entry = start >> PAGE_SHIFT; > > > + unsigned which_pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK; > > > + unsigned which_pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK; > > > + unsigned which_pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK; > > > unsigned num_entries = length >> PAGE_SHIFT; > > > - unsigned act_pt = first_entry / GEN8_PTES_PER_PAGE; > > > - unsigned first_pte = first_entry % GEN8_PTES_PER_PAGE; > > > unsigned last_pte, i; > > > > > > scratch_pte = gen8_pte_encode(ppgtt->base.scratch.addr, > > > I915_CACHE_LLC, use_scratch); > > > > > > while (num_entries) { > > > - struct page *page_table = &ppgtt->gen8_pt_pages[act_pt]; > > > + struct page *page_table = ppgtt->gen8_pt_pages[which_pdpe][which_pde]; > > > > > > - last_pte = first_pte + num_entries; > > > + last_pte = which_pte + num_entries; > > > if (last_pte > GEN8_PTES_PER_PAGE) > > > last_pte = GEN8_PTES_PER_PAGE; > > > > > > pt_vaddr = kmap_atomic(page_table); > > > > > > - for (i = first_pte; i < last_pte; i++) > > > + for (i = which_pte; i < last_pte; i++) { > > > pt_vaddr[i] = scratch_pte; > > > + num_entries--; > > > + BUG_ON(num_entries < 0); > > > > num_entries is unsigned. > > This was already changed per Chris' request. > > > > > > + } > > > > > > kunmap_atomic(pt_vaddr); > > > > > > - num_entries -= last_pte - first_pte; > > > - first_pte = 0; > > > - act_pt++; > > > + which_pte = 0; > > > + if (which_pde + 1 == GEN8_PDES_PER_PAGE) > > > + which_pdpe++; > > > + which_pde = (which_pde + 1) & GEN8_PDE_MASK; > > > } > > > } > > > > > > @@ -298,39 +314,57 @@ static void gen8_ppgtt_insert_entries(struct i915_address_space *vm, > > > struct i915_hw_ppgtt *ppgtt = > > > container_of(vm, struct i915_hw_ppgtt, base); > > > gen8_gtt_pte_t *pt_vaddr; > > > - unsigned first_entry = start >> PAGE_SHIFT; > > > - unsigned act_pt = first_entry / GEN8_PTES_PER_PAGE; > > > - unsigned act_pte = first_entry % GEN8_PTES_PER_PAGE; > > > + unsigned which_pdpe = start >> GEN8_PDPE_SHIFT & GEN8_PDPE_MASK; > > > + unsigned which_pde = start >> GEN8_PDE_SHIFT & GEN8_PDE_MASK; > > > + unsigned which_pte = start >> GEN8_PTE_SHIFT & GEN8_PTE_MASK; > > > struct sg_page_iter sg_iter; > > > > > > pt_vaddr = NULL; > > > + > > > for_each_sg_page(pages->sgl, &sg_iter, pages->nents, 0) { > > > + if (WARN_ON(which_pdpe >= GEN8_LEGACY_PDPS)) > > > + break; > > > + > > > if (pt_vaddr == NULL) > > > - pt_vaddr = kmap_atomic(&ppgtt->gen8_pt_pages[act_pt]); > > > + pt_vaddr = kmap_atomic(ppgtt->gen8_pt_pages[which_pdpe][which_pde]); > > > > > > - pt_vaddr[act_pte] = > > > + pt_vaddr[which_pte] = > > > gen8_pte_encode(sg_page_iter_dma_address(&sg_iter), > > > cache_level, true); > > > - if (++act_pte == GEN8_PTES_PER_PAGE) { > > > + if (++which_pte == GEN8_PTES_PER_PAGE) { > > > kunmap_atomic(pt_vaddr); > > > pt_vaddr = NULL; > > > - act_pt++; > > > - act_pte = 0; > > > + if (which_pde + 1 == GEN8_PDES_PER_PAGE) > > > + which_pdpe++; > > > > Afaics which_pde = (which_pde + 1) & GEN8_PDE_MASK; is missing here. > > > > This was already changed per Chris' request. > > > > + which_pte = 0; > > > > > > > > > } > > > } > > > if (pt_vaddr) > > > kunmap_atomic(pt_vaddr); > > > } > > > > > > -static void gen8_ppgtt_free(struct i915_hw_ppgtt *ppgtt) > > > +static void gen8_free_page_tables(struct page **pt_pages) > > > +{ > > > + int i; > > > + > > > + if (pt_pages == NULL) > > > + return; > > > + > > > + for (i = 0; i < GEN8_PDES_PER_PAGE; i++) > > > + if (pt_pages[i]) > > > + __free_pages(pt_pages[i], 0); > > > +} > > > + > > > +static void gen8_ppgtt_free(const struct i915_hw_ppgtt *ppgtt) > > > { > > > int i; > > > > > > - for (i = 0; i < ppgtt->num_pd_pages ; i++) > > > + for (i = 0; i < ppgtt->num_pd_pages; i++) { > > > + gen8_free_page_tables(ppgtt->gen8_pt_pages[i]); > > > kfree(ppgtt->gen8_pt_dma_addr[i]); > > > + } > > > > > > kfree(ppgtt->gen8_pt_dma_addr); > > > - __free_pages(ppgtt->gen8_pt_pages, get_order(ppgtt->num_pt_pages << PAGE_SHIFT)); > > > __free_pages(ppgtt->pd_pages, get_order(ppgtt->num_pd_pages << PAGE_SHIFT)); > > > } > > > > > > @@ -369,20 +403,59 @@ static void gen8_ppgtt_cleanup(struct i915_address_space *vm) > > > gen8_ppgtt_free(ppgtt); > > > } > > > > > > +static struct page **__gen8_alloc_page_tables(void) > > > +{ > > > + struct page **pt_pages; > > > + int i; > > > + > > > + pt_pages = kcalloc(GEN8_PDES_PER_PAGE, sizeof(struct page *), GFP_KERNEL); > > > + if (!pt_pages) > > > + return ERR_PTR(-ENOMEM); > > > + > > > + for (i = 0; i < GEN8_PDES_PER_PAGE; i++) { > > > + pt_pages[i] = alloc_page(GFP_KERNEL); > > > + if (!pt_pages[i]) > > > + goto bail; > > > + } > > > + > > > + return pt_pages; > > > + > > > +bail: > > > + gen8_free_page_tables(pt_pages); > > > + kfree(pt_pages); > > > + return ERR_PTR(-ENOMEM); > > > +} > > > + > > > static int gen8_ppgtt_allocate_page_tables(struct i915_hw_ppgtt *ppgtt, > > > const int max_pdp) > > > { > > > - struct page *pt_pages; > > > + struct page **pt_pages[GEN8_LEGACY_PDPS]; > > > const int num_pt_pages = GEN8_PDES_PER_PAGE * max_pdp; > > > + int i, ret; > > > > > > - pt_pages = alloc_pages(GFP_KERNEL, get_order(num_pt_pages << PAGE_SHIFT)); > > > - if (!pt_pages) > > > - return -ENOMEM; > > > + for (i = 0; i < max_pdp; i++) { > > > + pt_pages[i] = __gen8_alloc_page_tables(); > > > + if (IS_ERR(pt_pages[i])) { > > > + ret = PTR_ERR(pt_pages[i]); > > > + goto unwind_out; > > > + } > > > + } > > > + > > > + /* NB: Avoid touching gen8_pt_pages until last to keep the allocation, > > > + * "atomic" - for cleanup purposes. > > > + */ > > > + for (i = 0; i < max_pdp; i++) > > > + ppgtt->gen8_pt_pages[i] = pt_pages[i]; > > > > > > - ppgtt->gen8_pt_pages = pt_pages; > > > ppgtt->num_pt_pages = 1 << get_order(num_pt_pages << PAGE_SHIFT); > > > > > > return 0; > > > + > > > +unwind_out: > > > + while(i--) > > > + gen8_free_page_tables(pt_pages[i]); > > > > I guess Ville commented on this issue, but pt_pages would be leaked > > here. > > I think Ville was referring to a different issue. The PPGTT struct > itself isn't freed (if I understood his point correctly, which was > indeed an issue). Forgive my ignorance here but I don't see where the > leak is. __gen8_alloc_page_tables() appears to always free if it failed, Yes that cleanup inside __gen8_alloc_page_tables is ok. > and unwind out should work backwards. Can you show me what I've missed? As I understand after pt_pages[i] = __gen8_alloc_page_tables(); we have a kcalloc()'d buffer in pt_pages[i]. Each entry of this buffer points to an alloc_page()'d page. Then on error at unwind_out for 0..i-1 we call gen8_free_page_tables() which will do only a __free_pages() on each of the above alloc_page'd entry. But then we still have the kcalloc'd buffer, which I can't see being freed anywhere. --Imre > > > > > > + > > > + return ret; > > > } > > > > > > static int gen8_ppgtt_allocate_dma(struct i915_hw_ppgtt *ppgtt) > > > @@ -475,7 +548,7 @@ static int gen8_ppgtt_setup_page_tables(struct i915_hw_ppgtt *ppgtt, > > > struct page *p; > > > int ret; > > > > > > - p = &ppgtt->gen8_pt_pages[pd * GEN8_PDES_PER_PAGE + pt]; > > > + p = ppgtt->gen8_pt_pages[pd][pt]; > > > pt_addr = pci_map_page(ppgtt->base.dev->pdev, > > > p, 0, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL); > > > ret = pci_dma_mapping_error(ppgtt->base.dev->pdev, pt_addr); > > > > > _______________________________________________ Intel-gfx mailing list Intel-gfx@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/intel-gfx