On Wed, Dec 19, 2012 at 09:22:26AM +0000, Chris Wilson wrote:
> On Wed, 19 Dec 2012 11:56:18 +1000, Dave Airlie <airlied@xxxxxxxxx> wrote:
> > From: Dave Airlie <airlied@xxxxxxxxxx>
> >
> > So we have two offset manager implementations for dealing with VMA offsets.
> > GEM had one using the hash table, TTM had one using an rbtree.
> >
> > I'd rather we just had one to rule them all, and since ttm is using the rbtree
> > variant to allow sub-mappings, to keep ABI we need to use that one.
> >
> > This also adds a bunch of inline helpers to avoid gem/ttm poking around
> > inside the vma_offset objects. TTM needs a reset helper to remove a vma_offset
> > when it copies an object to another one for buffer moves. The helpers
> > also let drivers avoid the map_list.hash_key << PAGE_SHIFT nonsense.
>
> Any clue as to the performance difference between the two
> implementations? What does it add to the cost of a pagefault?
>
> Hmm, don't have an i-g-t handy for scalability testing of the fault
> handlers.
>
> > +int drm_vma_offset_setup(struct drm_vma_offset_manager *man,
> > +			 struct drm_vma_offset_node *node,
> > +			 unsigned long num_pages)
> > +{
> > +	int ret;
> > +
> > +retry_pre_get:
> > +	ret = drm_mm_pre_get(&man->addr_space_mm);
> > +	if (unlikely(ret != 0))
> > +		return ret;
> > +
> > +	write_lock(&man->vm_lock);
> > +	node->vm_node = drm_mm_search_free(&man->addr_space_mm,
> > +					   num_pages, 0, 0);
> > +
> > +	if (unlikely(node->vm_node == NULL)) {
> > +		ret = -ENOMEM;
>
> 		ret = -ENOSPC;
>
> Depended upon by the higher layers for determining when to purge their
> caches; i-g-t/gem_mmap_offset_exhaustion

The larger topic is that drm_mm is only 32 bits on 32-bit platforms and we
routinely exhaust that after a few weeks of uptime. Or rather: we _did_
exhaust it, until we added tons of checks in both the kernel and libdrm to
reap cached objects when the allocation fails. Hence it's paramount for our
code to get back -ENOSPC to engage in mmap offset reaping.

> > +		goto out_unlock;
> > +	}
> > +
> > +	node->vm_node = drm_mm_get_block_atomic(node->vm_node,
> > +						num_pages, 0);
> > +
>
> I'd feel happier if this tried to only take from the preallocated pool
> rather than actually try a GFP_ATOMIC allocation.

Seconded. Imo drm_mm_pre_get should just die - it artificially limits the
prealloc pool to 4 nodes, which means you'll fall over if 5 threads enter
this path at once. And with the drm_mm rework of about a year ago it's no
longer required: you can simply embed the struct drm_mm_node wherever you
want, and drm_mm won't allocate anything on top of that any more. Or
preallocate the drm_mm_node in your code, outside of the locks. Inserting
the node then happens with the drm_mm_insert* functions, which combine
search_free + get_block (since really, there's nothing interesting you can
do with the hole space you get back from search_free). See e.g.

http://cgit.freedesktop.org/~danvet/drm-intel/commit/?id=dc9dd7a20fde95aa81a8307cde79c2dff9f83f3d

for the conversion.

My efforts of yore to convert everyone have stalled a bit in the ttm code,
but now that i915 is converted and (hopefully) the mmap offset stuff
follows, I'll pick this up again. That would allow us to kill quite a bit
of cruft from the drm_mm interfaces.

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
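
To make that a bit more concrete, here's a rough and completely untested
sketch of what the embedded-node variant of the setup function could look
like. It assumes drm_vma_offset_node embeds the struct drm_mm_node directly
(instead of holding a pointer), and the rbtree insert helper at the end is
just a made-up placeholder for whatever the real patch uses:

/*
 * Untested sketch only, not the actual patch: with vm_node embedded in
 * drm_vma_offset_node, nothing needs to be allocated under the lock, so
 * both drm_mm_pre_get() and the GFP_ATOMIC get_block path go away.
 */
int drm_vma_offset_setup(struct drm_vma_offset_manager *man,
			 struct drm_vma_offset_node *node,
			 unsigned long num_pages)
{
	int ret;

	write_lock(&man->vm_lock);

	/*
	 * search_free + get_block collapsed into one call; this returns
	 * -ENOSPC when the address space is full, which is exactly what
	 * the higher layers need to kick off mmap offset reaping.
	 */
	ret = drm_mm_insert_node(&man->addr_space_mm, &node->vm_node,
				 num_pages, 0);
	if (ret)
		goto out_unlock;

	/*
	 * Hook the node into the rbtree for sub-mapping lookups as
	 * before; placeholder name, not a real helper.
	 */
	drm_vma_offset_insert_rb(man, node);

out_unlock:
	write_unlock(&man->vm_lock);
	return ret;
}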