Re: [PATCH 0/1][RFC] drm/ttm Improved io_mem_reserve / io_mem_free_calling

Thomas Hellstrom <thellstrom@xxxxxxxxxx> · Thu, 11 Nov 2010 17:50:28 +0100

On 11/11/2010 04:27 PM, Jerome Glisse wrote:
On Thu, Nov 11, 2010 at 3:41 AM, Thomas Hellstrom<thellstrom@xxxxxxxxxx>  wrote:

The following patch is really intended for the next merge window.

RFC:
1) Are there any implementations of driver::io_mem_reserve today that can't use
the fastpath?
2) Can we put an atomic requirement on driver::io_mem_reserve /
driver::io_mem_free?

The patch improves on the io_mem_reserve / io_mem_free calling sequences by
introducing a fastpath and an optional eviction mechanism.

The fastpath is enabled by default and is switched off by the driver setting
struct ttm_mem_type_manager::io_reserve_fastpath to false on mem type init.
With the fastpath no locking occurs, and io_mem_free is never called.
I'm not sure if there are any implementations today that could not use the
fastpath.

As mentioned in the patch, if the fastpath is disabled, calls to
io_mem_reserve and io_mem_free are exactly balanced, and refcounted within
struct ttm_mem_reg so that io_mem_reserve should never be called recursively
for the same struct ttm_mem_reg.
Locking is required to make sure that ptes are never present on when the
underlying memory region is not reserved. Currently I'm using
man::io_reserve_mutex for this. Can we use a spinlock? That would require
io_mem_reserve and io_mem_free to be atomic.

Optionally, there is an eviction mechanism that is activated by setting
struct ttm_mem_type_manager::use_io_reserve_lru to true when initialized.
If the eviction mechanism is activated, and io_mem_reserve returns -EAGAIN,
it will attempt to kill user-space mappings to free up reserved regions.
Kernel mappings (ttm_bo_kmap) are not affected.

Radeon can use fast path, i think nouveau can too. I am not sure we
can consider io_mem_reserve as atomic. Use case i fear is GPU with
remappable apperture i don't know what kind of code we would need for
that and it might sleep. Thought my first guess is that it likely can
be done atomicly.

In that case, I think I will change it to a spinlock, with a code 
comment that it can be changed to a mutex later if it turns out to be 
very hard / impossible to implement atomic operations. Another possible 
concern is the execution of umap_mapping_range() that may in some cases 
be long. Perhaps too long to use a spinlock.

Quick review of the patch looks good, i will try to take a closer look latter.

Cheers,
Jerome Glisse

Great. Thanks,
Thomas

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel