Re: [PATCH 0/1][RFC] drm/ttm Improved io_mem_reserve / io_mem_free_calling

Thomas Hellstrom <thellstrom@xxxxxxxxxx> · Fri, 12 Nov 2010 08:18:33 +0100

On 11/11/2010 11:46 PM, Ben Skeggs wrote:
On Thu, 2010-11-11 at 17:50 +0100, Thomas Hellstrom wrote:

On 11/11/2010 04:27 PM, Jerome Glisse wrote:

On Thu, Nov 11, 2010 at 3:41 AM, Thomas Hellstrom<thellstrom@xxxxxxxxxx>   wrote:

The following patch is really intended for the next merge window.

RFC:
1) Are there any implementations of driver::io_mem_reserve today that can't use
the fastpath?
2) Can we put an atomic requirement on driver::io_mem_reserve /
driver::io_mem_free?

The patch improves on the io_mem_reserve / io_mem_free calling sequences by
introducing a fastpath and an optional eviction mechanism.

The fastpath is enabled by default and is switched off by the driver setting
struct ttm_mem_type_manager::io_reserve_fastpath to false on mem type init.
With the fastpath no locking occurs, and io_mem_free is never called.
I'm not sure if there are any implementations today that could not use the
fastpath.

As mentioned in the patch, if the fastpath is disabled, calls to
io_mem_reserve and io_mem_free are exactly balanced, and refcounted within
struct ttm_mem_reg so that io_mem_reserve should never be called recursively
for the same struct ttm_mem_reg.
Locking is required to make sure that ptes are never present on when the
underlying memory region is not reserved. Currently I'm using
man::io_reserve_mutex for this. Can we use a spinlock? That would require
io_mem_reserve and io_mem_free to be atomic.

Optionally, there is an eviction mechanism that is activated by setting
struct ttm_mem_type_manager::use_io_reserve_lru to true when initialized.
If the eviction mechanism is activated, and io_mem_reserve returns -EAGAIN,
it will attempt to kill user-space mappings to free up reserved regions.
Kernel mappings (ttm_bo_kmap) are not affected.

Radeon can use fast path, i think nouveau can too. I am not sure we
can consider io_mem_reserve as atomic. Use case i fear is GPU with
remappable apperture i don't know what kind of code we would need for
that and it might sleep. Thought my first guess is that it likely can
be done atomicly.

In that case, I think I will change it to a spinlock, with a code
comment that it can be changed to a mutex later if it turns out to be
very hard / impossible to implement atomic operations. Another possible
concern is the execution of umap_mapping_range() that may in some cases
be long. Perhaps too long to use a spinlock.

I'd rather keep the mutex personally, the code I have in development
uses mutexes itself beyond the io_mem_reserve/io_mem_free calls.  An
earlier revision used spinlocks, but it wasn't very nice.

Ben.

OK.
Note that any per-mem-type shared objects accessed by io_mem_reserve / 
io_mem_free don't need any further protection beyond the lock we're 
discussing. For the same mem_type, io_mem_reserve / io_mem_free will be 
completely serialized with this patch.

Anyway, let's keep the mutex.

/Thomas

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel