On Wed, Dec 9, 2020 at 10:32 AM Thomas Zimmermann <tzimmermann@xxxxxxx> wrote: > > Hi > > Am 09.12.20 um 01:13 schrieb Daniel Vetter: > > On Fri, Dec 04, 2020 at 09:47:08AM +0100, Christian König wrote: > >> Am 04.12.20 um 09:32 schrieb Thomas Zimmermann: > >>> Hi > >>> > >>> Am 03.12.20 um 21:41 schrieb Daniel Vetter: > >>>> On Thu, Dec 03, 2020 at 07:59:04PM +0100, Thomas Zimmermann wrote: > >>>>> Hi > >>>>> > >>>>> Am 03.12.20 um 16:26 schrieb Daniel Vetter: > >>>>>> On Thu, Dec 03, 2020 at 03:02:59PM +0100, Thomas Zimmermann wrote: > >>>>>>> Dma-buf's vmap and vunmap callbacks are undocumented and various > >>>>>>> exporters currently have slightly different semantics for them. Add > >>>>>>> documentation on how to implement and use these interfaces correctly. > >>>>>>> > >>>>>>> v2: > >>>>>>> * document vmap semantics in struct dma_buf_ops > >>>>>>> * add TODO item for reviewing and maybe fixing dma-buf exporters > >>>>>>> > >>>>>>> Signed-off-by: Thomas Zimmermann <tzimmermann@xxxxxxx> > >>>>>> > >>>>>> I still don't think this works, we're breaking dma_buf_vmap > >>>>>> for everyone > >>>>>> else here. > >>>>> > >>>>> I removed the text on the importer. These notes for callers in > >>>>> the docs are > >>>>> more or less a consequence of the exporter semantics. > >>>> > >>>> Callers are importers, so I'm not seeing how that fixes anything. > >>>> > >>>>> I thought we at least agreed on the exporter semantics in the > >>>>> other thread, > >>>>> didn't we? > >>>>> > >>>>> What I'm trying to do is to write dome some rules for exporters, > >>>>> even if not > >>>>> all exporters follow them. > >>>> > >>>> This is a standard interface, everyone needs to follow the same > >>>> rules. And > >>>> if they change, we need to make sure nothing breaks and we're not > >>>> creating > >>>> issues. > >>>> > >>>> In the past the rule was the dma_buf_vmap was allowed to take the > >>>> dma_resv_lock, and that the buffer should be pinned. Now some ttm > >>>> drivers > >>>> didn't ever bother with the pinning, and mostly got away with that > >>>> because > >>>> drm_prime helpers do the pinning by default at attach time, and most > >>>> users > >>>> do call dma_buf_attach. > >>>> > >>>> But if you look through dma-buf docs nothing ever said you have to do a > >>>> dummy attachment before you call dma_buf_vmap, that's just slightly > >>>> crappy > >>>> implementations that didn't blow up yet. > >>> > >>> I had a patch for adding pin to radeon's implementation of vmap. [1] > >>> Christian told me to not do this; instead just get the lock in the fbdev > >>> code. His advise almost seems the opposite of what you're telling me > >>> here. > >> > >> I think what Daniel suggests here is that we need to smoothly transition the > >> code from making assumptions to having a straight interface where importers > >> explicitly say when stuff is locked and when stuff is pinned. > >> > >> I've started this with the attach interface by adding a new dynamic approach > >> to that, but you probably need to carry on here with that for vmap as well. > >> > >> When that is done we can migrate every exporter over to the new dynamic > >> approach. > >> > >>> > >>> For the GEM VRAM helpers, that implicit pin in vmap gave me headaches. > >>> Because scanouts can only be done from VRAM, which is badly suited for > >>> exporting. So I ended up with an implicit pin that pins the buffer to > >>> whatever domain it currently is. I got away with it because GEM VRAM BOs > >>> are not sharable among devices; fbdev is the only user of that > >>> functionality and only pins for short periods of time. > >>> > >>> I suspect that fixing TTM-based drivers by adding an implicit pin would > >>> result in a similar situation. Whatever domain it ends up pinning, some > >>> functionality might not be compatible with that. > >> > >> Correct, exactly that's the problem. > >> > >>> > >>>> > >>>>> Given the circumstances, we should leave out this patch from the > >>>>> patchset. > >>>> > >>>> So the defacto rules are already a big mess, but that's not a good > >>>> excuse > >>>> to make it worse. > >>>> > >>>> What I had in mind is that we split dma_buf_vmap up into two variants: > >>>> > >>>> - The current one, which should guarantee that the buffer is pinned. > >>>> Because that's what all current callers wanted, before the fbdev code > >>>> started allowing non-pinned buffers. > >>> > >>> Can we add an explicit pin operation to dma_buf_vmap() to enforce the > >>> semantics? > >> > >> At least I would be fine with that. For now amdgpu is the only exporter who > >> implements the explicit pin/unpin semantics anyway. > > > > Yup, I think that makes sense (if it works). Maybe we could use something > > like: > > > > a) dma_buf pin exists, driver is dynamic. This means dma_buf_vmap needs to > > first pin, then call ->vmap. dma_buf_vmap_local otoh can directly call > > ->vmap since the exporter relies on either a pin or dma_resv_lock. > > > > b) dma_buf pin not implement, driver is a legacy pile. dma_buf_vmap will > > pin (somewhere at least, or rely on some implicit pin), dma_buf_vmap_local > > doesn't work and should fail. > > I think I read in the dma-buf documentation that pin is supposed to put > the BO in a domain that is suitable for scanout. Now I don't really > trust this to work. Amdgpu, radeon and nouveau put it into the GTT > region. Qxl appears to put it wherever it is. Uh that sounds wrong, or at least not the full complexity. ->pin's main use right now is to freeze the dma-buf into system memory when there's a non-dynamic attachement for a dynamic exporter. There's also a dma_buf_pin call in amdgpu, and I think that only works for scanout on integrated gpu. Pinning to vram would break sharing for all existing dma-buf users. Christian can you perhaps explain when amdgpu uses dma_buf_pin()? My take is the ->pin callback should clarify that it should pin into system memory, for legacy (non-dynamic) dma-buf users. And dma_buf_pin() should gain some kerneldoc which then states that "This should only be used in limited use cases like scanout and not for temporary pin operations." I think the problem with this kerneldoc is simply that it tries to document the exporter and importer side of the contract in one place and makes a mess of it, plus leaves the actual importer side function undocumented. I think the kerneldoc also predates the final patch version, and we didn't update it fully. > > I think for less transition work fbdev helpers could first try > > dma_resv_lock + dma_buf_vmap_local, if that fails, drop the dma_resv_lock > > and do the pinning dma_buf_vmap. That way we don't have to convert shmem > > helpers over to dma_resv locking, which should help. > > I have meanwhile made a patchset that updates helpers for cma, shmem and > vram with vmap_local; and converts fbdev emulation as well. It needs a > bit more testing before being posted. Nice, even better! -Daniel > Best regards > Thomas > > > > > And ttm drivers would do the new clean interface, so at least everyone > > using dma_resv today is all fine. Intel's conversion to dma_resv lock is > > in-flight, but that needs a conversion to the dynamic interface anyway, > > the current code splats. And dynamic brings means explicit pin/unpin > > callbacks, so should be good too. > > -Daniel > > > >> > >> Regards, > >> Christian. > >> > >>> > >>> Best regards > >>> Thomas > >>> > >>> [1] https://patchwork.freedesktop.org/patch/400054/?series=83765&rev=1 > >>> > >>>> > >>>> - The new one, which allows vmapping with just dma_resv locked, and > >>>> should > >>>> have some caching in exporters. > >>>> > >>>> Breaking code and then adding todos about that is kinda not so cool > >>>> approach here imo. > >>>> > >>>> Also I guess ttm_bo_vmap should have a check that either the buffer is > >>>> pinned, or dma_resv_lock is held. > >>>> > >>>> Cheers, Daniel > >>>> > >>>> > >>>> > >>>>> > >>>>> Best regards > >>>>> Thomas > >>>>> > >>>>>> > >>>>>>> --- > >>>>>>> Documentation/gpu/todo.rst | 15 +++++++++++++ > >>>>>>> include/drm/drm_gem.h | 4 ++++ > >>>>>>> include/linux/dma-buf.h | 45 > >>>>>>> ++++++++++++++++++++++++++++++++++++++ > >>>>>>> 3 files changed, 64 insertions(+) > >>>>>>> > >>>>>>> diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst > >>>>>>> index 009d8e6c7e3c..32bb797a84fc 100644 > >>>>>>> --- a/Documentation/gpu/todo.rst > >>>>>>> +++ b/Documentation/gpu/todo.rst > >>>>>>> @@ -505,6 +505,21 @@ Contact: Thomas Zimmermann > >>>>>>> <tzimmermann@xxxxxxx>, Christian König, Daniel Vette > >>>>>>> Level: Intermediate > >>>>>>> +Enforce rules for dma-buf vmap and pin ops > >>>>>>> +------------------------------------------ > >>>>>>> + > >>>>>>> +Exporter implementations of vmap and pin in struct > >>>>>>> dma_buf_ops (and consequently > >>>>>>> +struct drm_gem_object_funcs) use a variety of locking > >>>>>>> semantics. Some rely on > >>>>>>> +the caller holding the dma-buf's reservation lock, some > >>>>>>> do their own locking, > >>>>>>> +some don't require any locking. VRAM helpers even used > >>>>>>> to pin as part of vmap. > >>>>>>> + > >>>>>>> +We need to review each exporter and enforce the documented rules. > >>>>>>> + > >>>>>>> +Contact: Christian König, Daniel Vetter, Thomas > >>>>>>> Zimmermann <tzimmermann@xxxxxxx> > >>>>>>> + > >>>>>>> +Level: Advanced > >>>>>>> + > >>>>>>> + > >>>>>>> Core refactorings > >>>>>>> ================= > >>>>>>> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h > >>>>>>> index 5e6daa1c982f..1864c6a721b1 100644 > >>>>>>> --- a/include/drm/drm_gem.h > >>>>>>> +++ b/include/drm/drm_gem.h > >>>>>>> @@ -138,6 +138,8 @@ struct drm_gem_object_funcs { > >>>>>>> * drm_gem_dmabuf_vmap() helper. > >>>>>>> * > >>>>>>> * This callback is optional. > >>>>>>> + * > >>>>>>> + * See also struct dma_buf_ops.vmap > >>>>>>> */ > >>>>>>> int (*vmap)(struct drm_gem_object *obj, struct > >>>>>>> dma_buf_map *map); > >>>>>>> @@ -148,6 +150,8 @@ struct drm_gem_object_funcs { > >>>>>>> * drm_gem_dmabuf_vunmap() helper. > >>>>>>> * > >>>>>>> * This callback is optional. > >>>>>>> + * > >>>>>>> + * See also struct dma_buf_ops.vunmap > >>>>>>> */ > >>>>>>> void (*vunmap)(struct drm_gem_object *obj, struct > >>>>>>> dma_buf_map *map); > >>>>>>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > >>>>>>> index cf72699cb2bc..dc81fdc01dda 100644 > >>>>>>> --- a/include/linux/dma-buf.h > >>>>>>> +++ b/include/linux/dma-buf.h > >>>>>>> @@ -267,7 +267,52 @@ struct dma_buf_ops { > >>>>>>> */ > >>>>>>> int (*mmap)(struct dma_buf *, struct vm_area_struct *vma); > >>>>>>> + /** > >>>>>>> + * @vmap: > >>>>>> > >>>>>> There's already a @vmap and @vunamp kerneldoc at the top comment, that > >>>>>> needs to be removed. > >>>>>> -Daniel > >>>>>> > >>>>>>> + * > >>>>>>> + * Returns a virtual address for the buffer. > >>>>>>> + * > >>>>>>> + * Notes to callers: > >>>>>>> + * > >>>>>>> + * - Callers must hold the struct dma_buf.resv lock > >>>>>>> before calling > >>>>>>> + * this interface. > >>>>>>> + * > >>>>>>> + * - Callers must provide means to prevent the > >>>>>>> mappings from going > >>>>>>> + * stale, such as holding the reservation lock or providing a > >>>>>>> + * move-notify callback to the exporter. > >>>>>>> + * > >>>>>>> + * Notes to implementors: > >>>>>>> + * > >>>>>>> + * - Implementations must expect pairs of @vmap and > >>>>>>> @vunmap to be > >>>>>>> + * called frequently and should optimize for this case. > >>>>>>> + * > >>>>>>> + * - Implementations should avoid additional operations, such as > >>>>>>> + * pinning. > >>>>>>> + * > >>>>>>> + * - Implementations may expect the caller to hold the dma-buf's > >>>>>>> + * reservation lock to protect against concurrent calls and > >>>>>>> + * relocation. > >>>>>>> + * > >>>>>>> + * - Implementations may provide additional > >>>>>>> guarantees, such as working > >>>>>>> + * without holding the reservation lock. > >>>>>>> + * > >>>>>>> + * This callback is optional. > >>>>>>> + * > >>>>>>> + * Returns: > >>>>>>> + * > >>>>>>> + * 0 on success or a negative error code on failure. > >>>>>>> + */ > >>>>>>> int (*vmap)(struct dma_buf *dmabuf, struct dma_buf_map *map); > >>>>>>> + > >>>>>>> + /** > >>>>>>> + * @vunmap: > >>>>>>> + * > >>>>>>> + * Releases the address previously returned by @vmap. > >>>>>>> + * > >>>>>>> + * This callback is optional. > >>>>>>> + * > >>>>>>> + * See also @vmap() > >>>>>>> + */ > >>>>>>> void (*vunmap)(struct dma_buf *dmabuf, struct > >>>>>>> dma_buf_map *map); > >>>>>>> }; > >>>>>>> -- > >>>>>>> 2.29.2 > >>>>>>> > >>>>>> > >>>>> > >>>>> -- > >>>>> Thomas Zimmermann > >>>>> Graphics Driver Developer > >>>>> SUSE Software Solutions Germany GmbH > >>>>> Maxfeldstr. 5, 90409 Nürnberg, Germany > >>>>> (HRB 36809, AG Nürnberg) > >>>>> Geschäftsführer: Felix Imendörffer > >>>>> > >>>> > >>>> > >>>> > >>>> > >>> > >> > > > > -- > Thomas Zimmermann > Graphics Driver Developer > SUSE Software Solutions Germany GmbH > Maxfeldstr. 5, 90409 Nürnberg, Germany > (HRB 36809, AG Nürnberg) > Geschäftsführer: Felix Imendörffer > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch