On Mon, Nov 7, 2011 at 12:35 PM, Rob Clark <robdclark@xxxxxxxxx> wrote: > On Thu, Nov 3, 2011 at 6:39 PM, <j.glisse@xxxxxxxxx> wrote: >> From: Jerome Glisse <jglisse@xxxxxxxxxx> >> >> Move the page allocation and freeing to driver callback and >> provide ttm code helper function for those. > > btw, having page allocation/freeing in driver callback would solve one > issue that I've been struggling with on ARM.. (well actually, two.. > but one that might not apply to some x86 drivers) > > 1) for buffers mapped uncached/writecombine to userspace, I need a > page that is removed from kernel linear map. ARM requires that all > virtual mappings of a page have the same cache attributes. With GEM > shmem backed buffers, I can't control page allocation. The TTM DMA > pool stuff looks also interesting in this regard, since I can avoid > the need to do cache invalidate operations if I know the previous use > of a page was not for a cached buffer. > > 2) well, in some cases we need pages allocated in low 32bits.. > although this applies to some x86 drivers too, and there is a proposal > of how to handle this issue. > > Previously, since omapdrm is a UMA driver, I'd not looked too closely > at TTM.. but maybe I need to revisit. Is there any good example of a > UMA driver using TTM? (Or even some documentation.. although I'm > happy with a good example.) Radeon IGP chipsets and APUs are UMA. Programming-wise, they still have the concept of vram, but rather than having dedicated vram, they utilize a chunk of contiguous system memory stolen from the top of system memory by the bios at boot time. In addition to "vram" they have a gart mechanism that lets you map cached or uncached system pages into the GPU's address space. Alex > > Just starting to look at the ttm code.. can you map back from 'struct > ttm_tt' to the GEM object? I'm wondering how the populate callback > can know the cache attributes that the buffer is created with.. > (maybe a dumb question, I'm a newbie when it comes to the TTM code..) > > BR, > -R > >> Most intrusive change, is the fact that we now only fully >> populate an object this simplify some of code designed around >> the page fault design. >> >> Signed-off-by: Jerome Glisse <jglisse@xxxxxxxxxx> >> --- >> drivers/gpu/drm/nouveau/nouveau_bo.c | 3 + >> drivers/gpu/drm/radeon/radeon_ttm.c | 2 + >> drivers/gpu/drm/ttm/ttm_bo_util.c | 31 ++++++----- >> drivers/gpu/drm/ttm/ttm_bo_vm.c | 13 ++-- >> drivers/gpu/drm/ttm/ttm_page_alloc.c | 42 ++++++++++++++ >> drivers/gpu/drm/ttm/ttm_tt.c | 97 +++---------------------------- >> drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c | 3 + >> include/drm/ttm/ttm_bo_driver.h | 41 ++++++++------ >> include/drm/ttm/ttm_page_alloc.h | 18 ++++++ >> 9 files changed, 125 insertions(+), 125 deletions(-) >> >> diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c >> index b060fa4..7e5ca3f 100644 >> --- a/drivers/gpu/drm/nouveau/nouveau_bo.c >> +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c >> @@ -28,6 +28,7 @@ >> */ >> >> #include "drmP.h" >> +#include "ttm/ttm_page_alloc.h" >> >> #include "nouveau_drm.h" >> #include "nouveau_drv.h" >> @@ -1050,6 +1051,8 @@ nouveau_bo_fence(struct nouveau_bo *nvbo, struct nouveau_fence *fence) >> >> struct ttm_bo_driver nouveau_bo_driver = { >> .ttm_tt_create = &nouveau_ttm_tt_create, >> + .ttm_tt_populate = &ttm_page_alloc_ttm_tt_populate, >> + .ttm_tt_unpopulate = &ttm_page_alloc_ttm_tt_unpopulate, >> .invalidate_caches = nouveau_bo_invalidate_caches, >> .init_mem_type = nouveau_bo_init_mem_type, >> .evict_flags = nouveau_bo_evict_flags, >> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c >> index 53ff62b..490afce 100644 >> --- a/drivers/gpu/drm/radeon/radeon_ttm.c >> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c >> @@ -584,6 +584,8 @@ struct ttm_tt *radeon_ttm_tt_create(struct ttm_bo_device *bdev, >> >> static struct ttm_bo_driver radeon_bo_driver = { >> .ttm_tt_create = &radeon_ttm_tt_create, >> + .ttm_tt_populate = &ttm_page_alloc_ttm_tt_populate, >> + .ttm_tt_unpopulate = &ttm_page_alloc_ttm_tt_unpopulate, >> .invalidate_caches = &radeon_invalidate_caches, >> .init_mem_type = &radeon_init_mem_type, >> .evict_flags = &radeon_evict_flags, >> diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c >> index 082fcae..60f204d 100644 >> --- a/drivers/gpu/drm/ttm/ttm_bo_util.c >> +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c >> @@ -244,7 +244,7 @@ static int ttm_copy_io_ttm_page(struct ttm_tt *ttm, void *src, >> unsigned long page, >> pgprot_t prot) >> { >> - struct page *d = ttm_tt_get_page(ttm, page); >> + struct page *d = ttm->pages[page]; >> void *dst; >> >> if (!d) >> @@ -281,7 +281,7 @@ static int ttm_copy_ttm_io_page(struct ttm_tt *ttm, void *dst, >> unsigned long page, >> pgprot_t prot) >> { >> - struct page *s = ttm_tt_get_page(ttm, page); >> + struct page *s = ttm->pages[page]; >> void *src; >> >> if (!s) >> @@ -342,6 +342,12 @@ int ttm_bo_move_memcpy(struct ttm_buffer_object *bo, >> if (old_iomap == NULL && ttm == NULL) >> goto out2; >> >> + if (ttm->state == tt_unpopulated) { >> + ret = ttm->bdev->driver->ttm_tt_populate(ttm); >> + if (ret) >> + goto out1; >> + } >> + >> add = 0; >> dir = 1; >> >> @@ -502,10 +508,16 @@ static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo, >> { >> struct ttm_mem_reg *mem = &bo->mem; pgprot_t prot; >> struct ttm_tt *ttm = bo->ttm; >> - struct page *d; >> - int i; >> + int ret; >> >> BUG_ON(!ttm); >> + >> + if (ttm->state == tt_unpopulated) { >> + ret = ttm->bdev->driver->ttm_tt_populate(ttm); >> + if (ret) >> + return ret; >> + } >> + >> if (num_pages == 1 && (mem->placement & TTM_PL_FLAG_CACHED)) { >> /* >> * We're mapping a single page, and the desired >> @@ -513,18 +525,9 @@ static int ttm_bo_kmap_ttm(struct ttm_buffer_object *bo, >> */ >> >> map->bo_kmap_type = ttm_bo_map_kmap; >> - map->page = ttm_tt_get_page(ttm, start_page); >> + map->page = ttm->pages[start_page]; >> map->virtual = kmap(map->page); >> } else { >> - /* >> - * Populate the part we're mapping; >> - */ >> - for (i = start_page; i < start_page + num_pages; ++i) { >> - d = ttm_tt_get_page(ttm, i); >> - if (!d) >> - return -ENOMEM; >> - } >> - >> /* >> * We need to use vmap to get the desired page protection >> * or to make the buffer object look contiguous. >> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c >> index 221b924..bc1d751 100644 >> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c >> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c >> @@ -174,18 +174,19 @@ static int ttm_bo_vm_fault(struct vm_area_struct *vma, struct vm_fault *vmf) >> vma->vm_page_prot = (bo->mem.placement & TTM_PL_FLAG_CACHED) ? >> vm_get_page_prot(vma->vm_flags) : >> ttm_io_prot(bo->mem.placement, vma->vm_page_prot); >> - } >> >> - /* >> - * Speculatively prefault a number of pages. Only error on >> - * first page. >> - */ >> + /* Allocate all page at once, most common usage */ >> + if (ttm->bdev->driver->ttm_tt_populate(ttm)) { >> + retval = VM_FAULT_OOM; >> + goto out_io_unlock; >> + } >> + } >> >> for (i = 0; i < TTM_BO_VM_NUM_PREFAULT; ++i) { >> if (bo->mem.bus.is_iomem) >> pfn = ((bo->mem.bus.base + bo->mem.bus.offset) >> PAGE_SHIFT) + page_offset; >> else { >> - page = ttm_tt_get_page(ttm, page_offset); >> + page = ttm->pages[page_offset]; >> if (unlikely(!page && i == 0)) { >> retval = VM_FAULT_OOM; >> goto out_io_unlock; >> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c >> index c4f18b9..5b5b1e0 100644 >> --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c >> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c >> @@ -852,6 +852,48 @@ void ttm_page_alloc_fini(void) >> _manager = NULL; >> } >> >> +int ttm_page_alloc_ttm_tt_populate(struct ttm_tt *ttm) >> +{ >> + struct ttm_mem_global *mem_glob = ttm->glob->mem_glob; >> + int ret; >> + >> + if (ttm->state != tt_unpopulated) >> + return 0; >> + >> + if (unlikely(ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) { >> + ret = ttm_tt_swapin(ttm); >> + if (unlikely(ret != 0)) >> + return ret; >> + } >> + >> + ret = ttm_get_pages(ttm->pages, ttm->num_pages, ttm->page_flags, >> + ttm->caching_state, ttm->dma_address); >> + if (ret != 0) >> + return -ENOMEM; >> + ret = ttm_mem_global_alloc_pages(mem_glob, ttm->pages, >> + ttm->num_pages, false, false); >> + if (unlikely(ret != 0)) { >> + ttm_put_pages(ttm->pages, ttm->num_pages, ttm->page_flags, >> + ttm->caching_state, ttm->dma_address); >> + return -ENOMEM; >> + } >> + >> + ttm->state = tt_unbound; >> + return 0; >> +} >> +EXPORT_SYMBOL(ttm_page_alloc_ttm_tt_populate); >> + >> +void ttm_page_alloc_ttm_tt_unpopulate(struct ttm_tt *ttm) >> +{ >> + struct ttm_mem_global *glob = ttm->glob->mem_glob; >> + >> + ttm_mem_global_free_pages(glob, ttm->pages, ttm->num_pages); >> + ttm_put_pages(ttm->pages, ttm->num_pages, ttm->page_flags, >> + ttm->caching_state, ttm->dma_address); >> + ttm->state = tt_unpopulated; >> +} >> +EXPORT_SYMBOL(ttm_page_alloc_ttm_tt_unpopulate); >> + >> int ttm_page_alloc_debugfs(struct seq_file *m, void *data) >> { >> struct ttm_page_pool *p; >> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c >> index b55e132..07e86c4 100644 >> --- a/drivers/gpu/drm/ttm/ttm_tt.c >> +++ b/drivers/gpu/drm/ttm/ttm_tt.c >> @@ -42,8 +42,6 @@ >> #include "ttm/ttm_placement.h" >> #include "ttm/ttm_page_alloc.h" >> >> -static int ttm_tt_swapin(struct ttm_tt *ttm); >> - >> /** >> * Allocates storage for pointers to the pages that back the ttm. >> */ >> @@ -62,75 +60,6 @@ static void ttm_tt_free_page_directory(struct ttm_tt *ttm) >> ttm->dma_address = NULL; >> } >> >> -static struct page *__ttm_tt_get_page(struct ttm_tt *ttm, int index) >> -{ >> - struct page *p; >> - struct ttm_mem_global *mem_glob = ttm->glob->mem_glob; >> - int ret; >> - >> - if (NULL == (p = ttm->pages[index])) { >> - >> - ret = ttm_get_pages(&ttm->pages[index], 1, ttm->page_flags, >> - ttm->caching_state, >> - &ttm->dma_address[index]); >> - if (ret != 0) >> - return NULL; >> - >> - ret = ttm_mem_global_alloc_pages(mem_glob, &ttm->pages[index], >> - 1, false, false); >> - if (unlikely(ret != 0)) >> - goto out_err; >> - } >> - return p; >> -out_err: >> - ttm_put_pages(&ttm->pages[index], 1, ttm->page_flags, >> - ttm->caching_state, &ttm->dma_address[index]); >> - return NULL; >> -} >> - >> -struct page *ttm_tt_get_page(struct ttm_tt *ttm, int index) >> -{ >> - int ret; >> - >> - if (unlikely(ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) { >> - ret = ttm_tt_swapin(ttm); >> - if (unlikely(ret != 0)) >> - return NULL; >> - } >> - return __ttm_tt_get_page(ttm, index); >> -} >> - >> -int ttm_tt_populate(struct ttm_tt *ttm) >> -{ >> - struct ttm_mem_global *mem_glob = ttm->glob->mem_glob; >> - int ret; >> - >> - if (ttm->state != tt_unpopulated) >> - return 0; >> - >> - if (unlikely(ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) { >> - ret = ttm_tt_swapin(ttm); >> - if (unlikely(ret != 0)) >> - return ret; >> - } >> - >> - ret = ttm_get_pages(ttm->pages, ttm->num_pages, ttm->page_flags, >> - ttm->caching_state, ttm->dma_address); >> - if (ret != 0) >> - return -ENOMEM; >> - ret = ttm_mem_global_alloc_pages(mem_glob, ttm->pages, >> - ttm->num_pages, false, false); >> - if (unlikely(ret != 0)) { >> - ttm_put_pages(ttm->pages, ttm->num_pages, ttm->page_flags, >> - ttm->caching_state, ttm->dma_address); >> - return -ENOMEM; >> - } >> - >> - ttm->state = tt_unbound; >> - return 0; >> -} >> -EXPORT_SYMBOL(ttm_tt_populate); >> - >> #ifdef CONFIG_X86 >> static inline int ttm_tt_set_page_caching(struct page *p, >> enum ttm_caching_state c_old, >> @@ -232,23 +161,13 @@ int ttm_tt_set_placement_caching(struct ttm_tt *ttm, uint32_t placement) >> } >> EXPORT_SYMBOL(ttm_tt_set_placement_caching); >> >> -static void ttm_tt_free_alloced_pages(struct ttm_tt *ttm) >> -{ >> - struct ttm_mem_global *glob = ttm->glob->mem_glob; >> - >> - ttm_mem_global_free_pages(glob, ttm->pages, ttm->num_pages); >> - ttm_put_pages(ttm->pages, ttm->num_pages, ttm->page_flags, >> - ttm->caching_state, ttm->dma_address); >> - ttm->state = tt_unpopulated; >> -} >> - >> void ttm_tt_destroy(struct ttm_tt *ttm) >> { >> if (unlikely(ttm == NULL)) >> return; >> >> if (likely(ttm->pages != NULL)) { >> - ttm_tt_free_alloced_pages(ttm); >> + ttm->bdev->driver->ttm_tt_unpopulate(ttm); >> ttm_tt_free_page_directory(ttm); >> } >> >> @@ -303,7 +222,7 @@ int ttm_tt_bind(struct ttm_tt *ttm, struct ttm_mem_reg *bo_mem) >> if (ttm->state == tt_bound) >> return 0; >> >> - ret = ttm_tt_populate(ttm); >> + ret = ttm->bdev->driver->ttm_tt_populate(ttm); >> if (ret) >> return ret; >> >> @@ -317,7 +236,7 @@ int ttm_tt_bind(struct ttm_tt *ttm, struct ttm_mem_reg *bo_mem) >> } >> EXPORT_SYMBOL(ttm_tt_bind); >> >> -static int ttm_tt_swapin(struct ttm_tt *ttm) >> +int ttm_tt_swapin(struct ttm_tt *ttm) >> { >> struct address_space *swap_space; >> struct file *swap_storage; >> @@ -331,6 +250,10 @@ static int ttm_tt_swapin(struct ttm_tt *ttm) >> swap_storage = ttm->swap_storage; >> BUG_ON(swap_storage == NULL); >> >> + ret = ttm->bdev->driver->ttm_tt_populate(ttm); >> + if (ret) >> + return ret; >> + >> swap_space = swap_storage->f_path.dentry->d_inode->i_mapping; >> >> for (i = 0; i < ttm->num_pages; ++i) { >> @@ -339,7 +262,7 @@ static int ttm_tt_swapin(struct ttm_tt *ttm) >> ret = PTR_ERR(from_page); >> goto out_err; >> } >> - to_page = __ttm_tt_get_page(ttm, i); >> + to_page = ttm->pages[i]; >> if (unlikely(to_page == NULL)) >> goto out_err; >> >> @@ -360,7 +283,7 @@ static int ttm_tt_swapin(struct ttm_tt *ttm) >> >> return 0; >> out_err: >> - ttm_tt_free_alloced_pages(ttm); >> + ttm->bdev->driver->ttm_tt_unpopulate(ttm); >> return ret; >> } >> >> @@ -412,7 +335,7 @@ int ttm_tt_swapout(struct ttm_tt *ttm, struct file *persistent_swap_storage) >> page_cache_release(to_page); >> } >> >> - ttm_tt_free_alloced_pages(ttm); >> + ttm->bdev->driver->ttm_tt_unpopulate(ttm); >> ttm->swap_storage = swap_storage; >> ttm->page_flags |= TTM_PAGE_FLAG_SWAPPED; >> if (persistent_swap_storage) >> diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c b/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c >> index cc72435..383993b 100644 >> --- a/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c >> +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_buffer.c >> @@ -28,6 +28,7 @@ >> #include "vmwgfx_drv.h" >> #include "ttm/ttm_bo_driver.h" >> #include "ttm/ttm_placement.h" >> +#include "ttm/ttm_page_alloc.h" >> >> static uint32_t vram_placement_flags = TTM_PL_FLAG_VRAM | >> TTM_PL_FLAG_CACHED; >> @@ -334,6 +335,8 @@ static int vmw_sync_obj_wait(void *sync_obj, void *sync_arg, >> >> struct ttm_bo_driver vmw_bo_driver = { >> .ttm_tt_create = &vmw_ttm_tt_create, >> + .ttm_tt_populate = &ttm_page_alloc_ttm_tt_populate, >> + .ttm_tt_unpopulate = &ttm_page_alloc_ttm_tt_unpopulate, >> .invalidate_caches = vmw_invalidate_caches, >> .init_mem_type = vmw_init_mem_type, >> .evict_flags = vmw_evict_flags, >> diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h >> index 6b8c5cd..ae06e42 100644 >> --- a/include/drm/ttm/ttm_bo_driver.h >> +++ b/include/drm/ttm/ttm_bo_driver.h >> @@ -319,6 +319,26 @@ struct ttm_bo_driver { >> struct page *dummy_read_page); >> >> /** >> + * ttm_tt_populate >> + * >> + * @ttm: The struct ttm_tt to contain the backing pages. >> + * >> + * Allocate all backing pages >> + * Returns: >> + * -ENOMEM: Out of memory. >> + */ >> + int (*ttm_tt_populate)(struct ttm_tt *ttm); >> + >> + /** >> + * ttm_tt_unpopulate >> + * >> + * @ttm: The struct ttm_tt to contain the backing pages. >> + * >> + * Free all backing page >> + */ >> + void (*ttm_tt_unpopulate)(struct ttm_tt *ttm); >> + >> + /** >> * struct ttm_bo_driver member invalidate_caches >> * >> * @bdev: the buffer object device. >> @@ -585,15 +605,6 @@ extern int ttm_tt_init(struct ttm_tt *ttm, struct ttm_bo_device *bdev, >> extern int ttm_tt_bind(struct ttm_tt *ttm, struct ttm_mem_reg *bo_mem); >> >> /** >> - * ttm_tt_populate: >> - * >> - * @ttm: The struct ttm_tt to contain the backing pages. >> - * >> - * Add backing pages to all of @ttm >> - */ >> -extern int ttm_tt_populate(struct ttm_tt *ttm); >> - >> -/** >> * ttm_ttm_destroy: >> * >> * @ttm: The struct ttm_tt. >> @@ -612,19 +623,13 @@ extern void ttm_tt_destroy(struct ttm_tt *ttm); >> extern void ttm_tt_unbind(struct ttm_tt *ttm); >> >> /** >> - * ttm_ttm_destroy: >> + * ttm_tt_swapin: >> * >> * @ttm: The struct ttm_tt. >> - * @index: Index of the desired page. >> * >> - * Return a pointer to the struct page backing @ttm at page >> - * index @index. If the page is unpopulated, one will be allocated to >> - * populate that index. >> - * >> - * Returns: >> - * NULL on OOM. >> + * Swap in a previously swap out ttm_tt. >> */ >> -extern struct page *ttm_tt_get_page(struct ttm_tt *ttm, int index); >> +extern int ttm_tt_swapin(struct ttm_tt *ttm); >> >> /** >> * ttm_tt_cache_flush: >> diff --git a/include/drm/ttm/ttm_page_alloc.h b/include/drm/ttm/ttm_page_alloc.h >> index fffb3bd..18a2d6b 100644 >> --- a/include/drm/ttm/ttm_page_alloc.h >> +++ b/include/drm/ttm/ttm_page_alloc.h >> @@ -67,6 +67,24 @@ int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages); >> void ttm_page_alloc_fini(void); >> >> /** >> + * ttm_page_alloc_ttm_tt_populate: >> + * >> + * @ttm: The struct ttm_tt to contain the backing pages. >> + * >> + * Add backing pages to all of @ttm >> + */ >> +extern int ttm_page_alloc_ttm_tt_populate(struct ttm_tt *ttm); >> + >> +/** >> + * ttm_page_alloc_ttm_tt_unpopulate: >> + * >> + * @ttm: The struct ttm_tt which to free backing pages. >> + * >> + * Free all pages of @ttm >> + */ >> +extern void ttm_page_alloc_ttm_tt_unpopulate(struct ttm_tt *ttm); >> + >> +/** >> * Output the state of pools to debugfs file >> */ >> extern int ttm_page_alloc_debugfs(struct seq_file *m, void *data); >> -- >> 1.7.7.1 >> >> _______________________________________________ >> dri-devel mailing list >> dri-devel@xxxxxxxxxxxxxxxxxxxxx >> http://lists.freedesktop.org/mailman/listinfo/dri-devel >> > _______________________________________________ > dri-devel mailing list > dri-devel@xxxxxxxxxxxxxxxxxxxxx > http://lists.freedesktop.org/mailman/listinfo/dri-devel > _______________________________________________ dri-devel mailing list dri-devel@xxxxxxxxxxxxxxxxxxxxx http://lists.freedesktop.org/mailman/listinfo/dri-devel