Re: [PATCH 1/8] drm/gem: Write down some rules for vmap usage

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Dec 1, 2020 at 11:27 AM Thomas Zimmermann <tzimmermann@xxxxxxx> wrote:
>
> Hi
>
> Am 01.12.20 um 11:00 schrieb Daniel Vetter:
> > On Tue, Dec 1, 2020 at 10:40 AM Thomas Zimmermann <tzimmermann@xxxxxxx> wrote:
> >>
> >> Hi
> >>
> >> Am 01.12.20 um 10:10 schrieb Daniel Vetter:
> >>> On Tue, Dec 1, 2020 at 9:32 AM Thomas Zimmermann <tzimmermann@xxxxxxx> wrote:
> >>>>
> >>>> Hi
> >>>>
> >>>> Am 30.11.20 um 16:33 schrieb Christian König:
> >>>>> Am 30.11.20 um 16:30 schrieb Daniel Vetter:
> >>>>>> On Mon, Nov 30, 2020 at 01:04:26PM +0100, Thomas Zimmermann wrote:
> >>>>>>> Mapping a GEM object's buffer into kernel address space prevents the
> >>>>>>> buffer from being evicted from VRAM, which in turn may result in
> >>>>>>> out-of-memory errors. It's therefore required to only vmap GEM BOs for
> >>>>>>> short periods of time; unless the GEM implementation provides additional
> >>>>>>> guarantees.
> >>>>>>>
> >>>>>>> Signed-off-by: Thomas Zimmermann <tzimmermann@xxxxxxx>
> >>>>>>> ---
> >>>>>>>     drivers/gpu/drm/drm_prime.c |  6 ++++++
> >>>>>>>     include/drm/drm_gem.h       | 16 ++++++++++++++++
> >>>>>>>     2 files changed, 22 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> >>>>>>> index 7db55fce35d8..9c9ece9833e0 100644
> >>>>>>> --- a/drivers/gpu/drm/drm_prime.c
> >>>>>>> +++ b/drivers/gpu/drm/drm_prime.c
> >>>>>>> @@ -669,6 +669,12 @@ EXPORT_SYMBOL(drm_gem_unmap_dma_buf);
> >>>>>>>      * callback. Calls into &drm_gem_object_funcs.vmap for device
> >>>>>>> specific handling.
> >>>>>>>      * The kernel virtual address is returned in map.
> >>>>>>>      *
> >>>>>>> + * To prevent the GEM object from being relocated, callers must hold
> >>>>>>> the GEM
> >>>>>>> + * object's reservation lock from when calling this function until
> >>>>>>> releasing the
> >>>>>>> + * mapping. Holding onto a mapping and the associated reservation
> >>>>>>> lock for an
> >>>>>>> + * unbound time may result in out-of-memory errors. Calls to
> >>>>>>> drm_gem_dmabuf_vmap()
> >>>>>>> + * should therefore be accompanied by a call to
> >>>>>>> drm_gem_dmabuf_vunmap().
> >>>>>>> + *
> >>>>>>>      * Returns 0 on success or a negative errno code otherwise.
> >>>>>> This is a dma-buf hook, which means just documenting the rules you'd like
> >>>>>> to have here isn't enough. We need to roll this out at the dma-buf level,
> >>>>>> and enforce it.
> >>>>>>
> >>>>>> Enforce it = assert_lock_held
> >>>>>>
> >>>>>> Roll out = review everyone. Because this goes through dma-buf it'll come
> >>>>>> back through shmem helpers (and other helpers and other subsystems) back
> >>>>>> to any driver using vmap for gpu buffers. This includes the media
> >>>>>> subsystem, and the media subsystem definitely doesn't cope with just
> >>>>>> temporarily mapping buffers. So there we need to pin them, which I think
> >>>>>> means we'll need 2 version of dma_buf_vmap - one that's temporary and
> >>>>>> requires we hold dma_resv lock, the other requires that the buffer is
> >>>>>> pinned.
> >>>>>
> >>>>> OR start to proper use the dma_buf_pin/dma_buf_unpin functions which I
> >>>>> added to cover this use case as well.
> >>>>
> >>>> While I generally agree, here are some thoughts:
> >>>>
> >>>> I found all generic pin functions useless, because they don't allow for
> >>>> specifying where to pin. With fbdev emulation, this means that console
> >>>> buffers might never make it to VRAM for scanout. If anything, the policy
> >>>> should be that pin always pins in HW-accessible memory.
> >>>>
> >>>> Pin has quite a bit of overhead (more locking, buffer movement), so it
> >>>> should be the second choice after regular vmap. To make both work
> >>>> together, pin probably relies on holding the reservation lock internally.
> >>>>
> >>>> Therefore I think we still would want some additional helpers, such as:
> >>>>
> >>>>      pin_unlocked(), which acquires the resv lock, calls regular pin and
> >>>> then drops the resv lock. Same for unpin_unlocked()
> >>>>
> >>>>      vmap_pinned(), which enforces that the buffer has been pinned and
> >>>> then calls regalar vmap. Same for vunmap_pinned()
> >>>>
> >>>> A typical pattern with these functions would look like this.
> >>>>
> >>>>           drm_gem_object bo;
> >>>>           dma_buf_map map;
> >>>>
> >>>>           init() {
> >>>>                   pin_unlocked(bo);
> >>>>                   vmap_pinned(bo, map);
> >>>>           }
> >>>>
> >>>>           worker() {
> >>>>                   begin_cpu_access()
> >>>>                   // access bo via map
> >>>>                   end_cpu_access()
> >>>>           }
> >>>>
> >>>>           fini() {
> >>>>                   vunmap_pinned(bo, map);
> >>>>                   unpin_unlocked(bo);
> >>>>           }
> >>>>
> >>>>           init()
> >>>>           while (...) {
> >>>>                   worker()
> >>>>           }
> >>>>           fini()
> >>>>
> >>>> Is that reasonable for media drivers?
> >>>
> >>> So media drivers go through dma-buf, which means we always pin into
> >>> system memory. Which I guess for vram-only display drivers makes no
> >>> sense and should be rejected, but we still need somewhat consistent
> >>> rules.
> >>>
> >>> The other thing is that if you do a dma_buf_attach without dynamic
> >>> mode, dma-buf will pin things for you already. So in many cases it
> >>
> >> Do you have a pointer to code that illustrates this well?
> >>
> >>> could be that we don't need a separate pin (but since the pin is in
> >>> the exporter, not dma-buf layer, we can't check for that). I'm also
> >>> not seeing why existing users need to split up their dma_buf_vmap into
> >>> a pin + vmap, they don't need them separately.
> >>
> >> It's two different operations, with pin having some possible overhead
> >> and failure conditions. And during the last discussion, pin was say to
> >> be for userspace-managed buffers. Kernel code should hold the
> >> reservation lock.
> >>
> >> For my POV, the current interfaces have no clear policy or semantics.
> >> Looking through the different GEM implementations, each one seems to
> >> have its own interpretation.
> >
> > Yup, that's the problem really. In the past we've had vmap exclusively
> > for permanently pinned access, with no locking requirements. Now we're
> > trying to make this more dynamic, but in a somewhat ad-hoc fashion
> > (generic fbdev emulation charged ahead with making the fbdev
> > framebuffer evictable), and now it breaks at every seam. Adding more
> > ad-hoc semantics on top doesn't seem like a good idea.
> >
> > That's why I think we should have 2 different interfaces:
> > - dma_buf_vmap, the one we currently have. Permanently pins the
> > buffer, mapping survives, no locking required.
> > - dma_buf_vmap_local, the new interface, the one that generic fbdev
> > should have used (but oh well mistakes happen), requires
> > dma_resv_lock, the mapping is only local to the caller
>
> In patch 6 of this series, there's ast cursor code that acquires two
> BO's reservation locks and vmaps them afterwards. That's probably how
> you intend to use dma_buf_vmap_local.
>
> However, I think it's more logically to have a vmap callback that only
> does the actual vmap. This is all that exporters would have to implement.
>
> And with that, build one helper that pins before vmap and one helper
> that gets the resv lock.
>
> I know that it might require some work on exporting drivers. But with
> your proposal, we probably need another dma-buf callback just for
> vmap_local. (?) That seems even less appealing to me.

The stuff I was explaining is what I expect the interface to look like
for the importers.

For exporters I think there's going to be two modes, and there I think
a single vmap plus separate (optional) pin as needed sounds
reasonable. Again that's pretty much what we've done for dynamic
exporters too. So sounds all good to me.

Btw I don't think we need a dma_buf_vmap_local_unlocked helper, imo
better to inflict the dma_resv_lock onto callers explicitly. It's a
bit more code, but the dma_resv is a very tricky lock which is highly
constrained. Surfacing it instead of hiding it feels better. In
general best practice is that the dma_resv is locked by the top most
caller possible, and lower functions and callbacks just have an
dma_resv_assert_locked.
-Daniel
>
> Best regards
> Thomas
>
> >
> > Trying to shovel both semantics into one interface, depending upon
> > which implementation we have backing the buffer, doesn't work indeed.
> >
> > Also on the pin topic, I think neither interface should require
> > callers to explicitly pin anything. For existing users it should
> > happen automatically behind the scenes imo, that's what they're
> > expecting.
> > -Daniel
> >
> >
> >>> I think we could use what we've done for dynamic dma-buf attachment
> >>> (which also change locking rules) and just have new functions for the
> >>> new way (i.e. short term vmap protected by dma_resv lock. Maybe call
> >>> these dma_buf_vmap_local, in the spirit of the new kmap_local which
> >>> are currently under discussion. I think _local suffix is better, for
> >>> otherwise people might do something silly like
> >>>
> >>>       dma_resv_lock();
> >>>       dma_buf_vmap_locked();
> >>>       dma_resv_unlock();
> >>>
> >>>       /* actual access maybe even in some other thread */
> >>>
> >>>      dma_buf_resv_lock();
> >>>      dma_buf_vunmap_unlocked();
> >>>      dma_resv_unlock();
> >>>
> >>> _local suffix is better at telling that the resulting pointer has very
> >>> limited use (essentially just local to the calling context, if you
> >>> don't change any locking or anything).
> >>
> >> _local sounds good.
> >>
> >> Best regards
> >> Thomas
> >>
> >>>
> >>> I think encouraging importers to call dma_buf_pin/unpin isn't a good
> >>> idea. Yes dynamic ones need it, but maybe we should check for that
> >>> somehow in the exporterd interface (atm only amdgpu is using it).
> >>> -Daniel
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>> Best regards
> >>>> Thomas
> >>>>
> >>>>
> >>>>>
> >>>>> Cheers,
> >>>>> Christian.
> >>>>>
> >>>>>>
> >>>>>> That's what I meant with that this approach here is very sprawling :-/
> >>>>>> -Daniel
> >>>>>>
> >>>>>>>      */
> >>>>>>>     int drm_gem_dmabuf_vmap(struct dma_buf *dma_buf, struct dma_buf_map
> >>>>>>> *map)
> >>>>>>> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> >>>>>>> index 5e6daa1c982f..7c34cd5ec261 100644
> >>>>>>> --- a/include/drm/drm_gem.h
> >>>>>>> +++ b/include/drm/drm_gem.h
> >>>>>>> @@ -137,7 +137,21 @@ struct drm_gem_object_funcs {
> >>>>>>>          * Returns a virtual address for the buffer. Used by the
> >>>>>>>          * drm_gem_dmabuf_vmap() helper.
> >>>>>>>          *
> >>>>>>> +     * Notes to implementors:
> >>>>>>> +     *
> >>>>>>> +     * - Implementations must expect pairs of @vmap and @vunmap to be
> >>>>>>> +     *   called frequently and should optimize for this case.
> >>>>>>> +     *
> >>>>>>> +     * - Implemenations may expect the caller to hold the GEM object's
> >>>>>>> +     *   reservation lock to protect against concurrent calls and
> >>>>>>> relocation
> >>>>>>> +     *   of the GEM object.
> >>>>>>> +     *
> >>>>>>> +     * - Implementations may provide additional guarantees (e.g.,
> >>>>>>> working
> >>>>>>> +     *   without holding the reservation lock).
> >>>>>>> +     *
> >>>>>>>          * This callback is optional.
> >>>>>>> +     *
> >>>>>>> +     * See also drm_gem_dmabuf_vmap()
> >>>>>>>          */
> >>>>>>>         int (*vmap)(struct drm_gem_object *obj, struct dma_buf_map *map);
> >>>>>>> @@ -148,6 +162,8 @@ struct drm_gem_object_funcs {
> >>>>>>>          * drm_gem_dmabuf_vunmap() helper.
> >>>>>>>          *
> >>>>>>>          * This callback is optional.
> >>>>>>> +     *
> >>>>>>> +     * See also @vmap.
> >>>>>>>          */
> >>>>>>>         void (*vunmap)(struct drm_gem_object *obj, struct dma_buf_map
> >>>>>>> *map);
> >>>>>>> --
> >>>>>>> 2.29.2
> >>>>>>>
> >>>>>
> >>>>> _______________________________________________
> >>>>> dri-devel mailing list
> >>>>> dri-devel@xxxxxxxxxxxxxxxxxxxxx
> >>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
> >>>>
> >>>> --
> >>>> Thomas Zimmermann
> >>>> Graphics Driver Developer
> >>>> SUSE Software Solutions Germany GmbH
> >>>> Maxfeldstr. 5, 90409 Nürnberg, Germany
> >>>> (HRB 36809, AG Nürnberg)
> >>>> Geschäftsführer: Felix Imendörffer
> >>>>
> >>>
> >>>
> >>
> >> --
> >> Thomas Zimmermann
> >> Graphics Driver Developer
> >> SUSE Software Solutions Germany GmbH
> >> Maxfeldstr. 5, 90409 Nürnberg, Germany
> >> (HRB 36809, AG Nürnberg)
> >> Geschäftsführer: Felix Imendörffer
> >>
> >
> >
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux