On Tue, Dec 1, 2020 at 9:32 AM Thomas Zimmermann <tzimmermann@xxxxxxx> wrote:
>
> Hi
>
> On 30.11.20 16:33, Christian König wrote:
> > On 30.11.20 16:30, Daniel Vetter wrote:
> >> On Mon, Nov 30, 2020 at 01:04:26PM +0100, Thomas Zimmermann wrote:
> >>> Mapping a GEM object's buffer into kernel address space prevents the
> >>> buffer from being evicted from VRAM, which in turn may result in
> >>> out-of-memory errors. It's therefore required to only vmap GEM BOs for
> >>> short periods of time, unless the GEM implementation provides additional
> >>> guarantees.
> >>>
> >>> Signed-off-by: Thomas Zimmermann <tzimmermann@xxxxxxx>
> >>> ---
> >>>  drivers/gpu/drm/drm_prime.c |  6 ++++++
> >>>  include/drm/drm_gem.h       | 16 ++++++++++++++++
> >>>  2 files changed, 22 insertions(+)
> >>>
> >>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> >>> index 7db55fce35d8..9c9ece9833e0 100644
> >>> --- a/drivers/gpu/drm/drm_prime.c
> >>> +++ b/drivers/gpu/drm/drm_prime.c
> >>> @@ -669,6 +669,12 @@ EXPORT_SYMBOL(drm_gem_unmap_dma_buf);
> >>>   * callback. Calls into &drm_gem_object_funcs.vmap for device specific handling.
> >>>   * The kernel virtual address is returned in map.
> >>>   *
> >>> + * To prevent the GEM object from being relocated, callers must hold the GEM
> >>> + * object's reservation lock from when calling this function until releasing the
> >>> + * mapping. Holding onto a mapping and the associated reservation lock for an
> >>> + * unbounded time may result in out-of-memory errors. Calls to drm_gem_dmabuf_vmap()
> >>> + * should therefore be accompanied by a call to drm_gem_dmabuf_vunmap().
> >>> + *
> >>>   * Returns 0 on success or a negative errno code otherwise.
> >> This is a dma-buf hook, which means just documenting the rules you'd like
> >> to have here isn't enough. We need to roll this out at the dma-buf level,
> >> and enforce it.
> >>
> >> Enforce it = assert_lock_held
> >>
> >> Roll out = review everyone. Because this goes through dma-buf it'll come
> >> back through shmem helpers (and other helpers and other subsystems) back
> >> to any driver using vmap for gpu buffers. This includes the media
> >> subsystem, and the media subsystem definitely doesn't cope with just
> >> temporarily mapping buffers. So there we need to pin them, which I think
> >> means we'll need two versions of dma_buf_vmap - one that's temporary and
> >> requires we hold the dma_resv lock, the other requires that the buffer is
> >> pinned.
> >
> > OR start to properly use the dma_buf_pin()/dma_buf_unpin() functions, which
> > I added to cover this use case as well.
>
> While I generally agree, here are some thoughts:
>
> I found all generic pin functions useless, because they don't allow for
> specifying where to pin. With fbdev emulation, this means that console
> buffers might never make it to VRAM for scanout. If anything, the policy
> should be that pin always pins in HW-accessible memory.
>
> Pin has quite a bit of overhead (more locking, buffer movement), so it
> should be the second choice after regular vmap. To make both work
> together, pin probably relies on holding the reservation lock internally.
>
> Therefore I think we still would want some additional helpers, such as:
>
> pin_unlocked(), which acquires the resv lock, calls regular pin and
> then drops the resv lock. Same for unpin_unlocked().
>
> vmap_pinned(), which enforces that the buffer has been pinned and
> then calls regular vmap.
> Same for vunmap_pinned().
>
> A typical pattern with these functions would look like this.
>
>   drm_gem_object bo;
>   dma_buf_map map;
>
>   init() {
>     pin_unlocked(bo);
>     vmap_pinned(bo, map);
>   }
>
>   worker() {
>     begin_cpu_access()
>     // access bo via map
>     end_cpu_access()
>   }
>
>   fini() {
>     vunmap_pinned(bo, map);
>     unpin_unlocked(bo);
>   }
>
>   init()
>   while (...) {
>     worker()
>   }
>   fini()
>
> Is that reasonable for media drivers?

So media drivers go through dma-buf, which means we always pin into system
memory. Which I guess for vram-only display drivers makes no sense and
should be rejected, but we still need somewhat consistent rules.

The other thing is that if you do a dma_buf_attach without dynamic mode,
dma-buf will pin things for you already. So in many cases it could be that
we don't need a separate pin (but since the pin is in the exporter, not the
dma-buf layer, we can't check for that).

I'm also not seeing why existing users need to split up their dma_buf_vmap
into a pin + vmap; they don't need them separately.

I think we could use what we've done for dynamic dma-buf attachment (which
also changes the locking rules) and just have new functions for the new way
(i.e. a short-term vmap protected by the dma_resv lock). Maybe call these
dma_buf_vmap_local, in the spirit of the new kmap_local which is currently
under discussion.

I think the _local suffix is better, for otherwise people might do
something silly like

	dma_resv_lock();
	dma_buf_vmap_locked();
	dma_resv_unlock();

	/* actual access, maybe even in some other thread */

	dma_resv_lock();
	dma_buf_vunmap_locked();
	dma_resv_unlock();

The _local suffix is better at telling that the resulting pointer has very
limited use (essentially just local to the calling context, if you don't
change any locking or anything). A rough sketch of the intended usage is
appended at the end of this mail.

I think encouraging importers to call dma_buf_pin/unpin isn't a good idea.
Yes, dynamic ones need it, but maybe we should check for that somehow in
the exporter interface (atm only amdgpu is using it).
-Daniel

> Best regards
> Thomas
>
> >
> > Cheers,
> > Christian.
> >
> >>
> >> That's what I meant with that this approach here is very sprawling :-/
> >> -Daniel
> >>
> >>>   */
> >>>  int drm_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct dma_buf_map *map)
> >>> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> >>> index 5e6daa1c982f..7c34cd5ec261 100644
> >>> --- a/include/drm/drm_gem.h
> >>> +++ b/include/drm/drm_gem.h
> >>> @@ -137,7 +137,21 @@ struct drm_gem_object_funcs {
> >>>       * Returns a virtual address for the buffer. Used by the
> >>>       * drm_gem_dmabuf_vmap() helper.
> >>>       *
> >>> +     * Notes to implementors:
> >>> +     *
> >>> +     * - Implementations must expect pairs of @vmap and @vunmap to be
> >>> +     *   called frequently and should optimize for this case.
> >>> +     *
> >>> +     * - Implementations may expect the caller to hold the GEM object's
> >>> +     *   reservation lock to protect against concurrent calls and relocation
> >>> +     *   of the GEM object.
> >>> +     *
> >>> +     * - Implementations may provide additional guarantees (e.g., working
> >>> +     *   without holding the reservation lock).
> >>> +     *
> >>>       * This callback is optional.
> >>> +     *
> >>> +     * See also drm_gem_dmabuf_vmap()
> >>>       */
> >>>      int (*vmap)(struct drm_gem_object *obj, struct dma_buf_map *map);
> >>> @@ -148,6 +162,8 @@ struct drm_gem_object_funcs {
> >>>       * drm_gem_dmabuf_vunmap() helper.
> >>>       *
> >>>       * This callback is optional.
> >>> +     *
> >>> +     * See also @vmap.
> >>>       */
> >>>      void (*vunmap)(struct drm_gem_object *obj, struct dma_buf_map *map);
> >>> --
> >>> 2.29.2
> >>>
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@xxxxxxxxxxxxxxxxxxxxx
> > https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
>

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel
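
For illustration, here is a minimal sketch of the dma_buf_vmap_local() usage
proposed above. The dma_buf_vmap_local()/dma_buf_vunmap_local() helpers are
hypothetical at this point; their signatures are assumed to mirror the
existing dma_buf_vmap()/dma_buf_vunmap(), and the mapping is assumed to stay
valid only while the reservation lock is held.

	#include <linux/dma-buf.h>
	#include <linux/dma-buf-map.h>
	#include <linux/dma-resv.h>

	/* Sketch: short-term CPU access to an imported dma-buf. */
	static int example_cpu_access(struct dma_buf *dmabuf)
	{
		struct dma_buf_map map;
		int ret;

		/* The mapping must not outlive the locked section. */
		dma_resv_lock(dmabuf->resv, NULL);

		ret = dma_buf_vmap_local(dmabuf, &map);	/* proposed helper */
		if (ret)
			goto out_unlock;

		/* Access the buffer through map.vaddr (or map.vaddr_iomem). */

		dma_buf_vunmap_local(dmabuf, &map);	/* proposed helper */

	out_unlock:
		dma_resv_unlock(dmabuf->resv);
		return ret;
	}

Unlike the silly pattern shown earlier, the pointer never escapes the locked
section, so the exporter is free to move the buffer again as soon as the
reservation lock is dropped.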