> The waiting done here is only for the shared fence to switch from explicit to
> implicit synchronization for correct interaction with the Intel driver.

Actually, here it is going to wait for all fences:

    94         ret = reservation_object_wait_timeout_rcu(bo->tbo.resv, true...

> ttm_bo_validate() is an asynchronous operation to enable GPU access to the
> BO, it isn't related at all to possible CPU access and can actually prevent it in a
> number of cases.
>
> amdgpu_ttm_bind() in turn binds the BO to the GART space and isn't related
> to CPU access either.

ttm_bo_validate() can move the buffer if necessary, and amdgpu_ttm_bind() will
flush the HDP, which we have discussed before.

> What you most likely need to do here is to use
> reservation_object_wait_timeout_rcu() to wait for all GPU operations to end.

Right, that is called by amdgpu_gem_prime_pin(). (An untested sketch of doing
that wait directly in begin_cpu_access() instead is appended at the end of this
mail.)

Sam

> -----Original Message-----
> From: Christian König [mailto:ckoenig.leichtzumerken at gmail.com]
> Sent: Wednesday, September 20, 2017 2:15 PM
> To: Li, Samuel <Samuel.Li at amd.com>; Koenig, Christian
> <Christian.Koenig at amd.com>; amd-gfx at lists.freedesktop.org
> Subject: Re: [PATCH v2 1/1] drm/amdgpu: Add gem_prime_mmap support
>
> Am 20.09.2017 um 19:47 schrieb Li, Samuel:
> >> No that isn't correct. Pinning just notes that the BO shouldn't be
> >> moved any more. It doesn't wait for anything.
> > It does. The following is from amdgpu_gem_prime_pin(),
> >  91          * Wait for all shared fences to complete before we switch to future
> >  92          * use of exclusive fence on this prime shared bo.
> >  93          */
> >  94         ret = reservation_object_wait_timeout_rcu(bo->tbo.resv, true, false,
> >  95                                                   MAX_SCHEDULE_TIMEOUT);
> >  96         if (unlikely(ret < 0)) {
> >  97                 DRM_DEBUG_PRIME("Fence wait failed: %li\n", ret);
> >  98                 amdgpu_bo_unreserve(bo);
> >  99                 return ret;
> > 100         }
> >
> > Besides, pinning process prepares all the stuff before and after
> > moving buffer (ttm_bo_validate, amdgpu_ttm_bind),
>
> No, you misunderstood what this is good for.
>
> The waiting done here is only for the shared fence to switch from explicit to
> implicit synchronization for correct interaction with the Intel driver.
>
> As soon as the BO is exported that shouldn't wait for anything any more.
>
> > I think if a buffer can be moved, it is probably also in a good condition to be
> > accessed.
>
> That is complete nonsense.
>
> ttm_bo_validate() is an asynchronous operation to enable GPU access to the
> BO, it isn't related at all to possible CPU access and can actually prevent it in a
> number of cases.
>
> amdgpu_ttm_bind() in turn binds the BO to the GART space and isn't related
> to CPU access either.
>
> What you most likely need to do here is to use
> reservation_object_wait_timeout_rcu() to wait for all GPU operations to end.
>
> Regards,
> Christian.
>
> >
> > Sam
> >
> >> -----Original Message-----
> >> From: Koenig, Christian
> >> Sent: Wednesday, September 20, 2017 1:38 PM
> >> To: Li, Samuel <Samuel.Li at amd.com>; amd-gfx at lists.freedesktop.org
> >> Subject: Re: [PATCH v2 1/1] drm/amdgpu: Add gem_prime_mmap support
> >>
> >> Am 20.09.2017 um 19:34 schrieb Li, Samuel:
> >>>> If that is what this callback should do then this implementation
> >>>> would be incorrect. Pinning doesn't wait for any GPU operation to finish.
> >>> During pinning, it will wait for all the fences to finish. That shall be OK.
> >> No that isn't correct. Pinning just notes that the BO shouldn't be
> >> moved any more.
> >>
> >> It doesn't wait for anything.
> >>
> >> Christian.
> >>
> >>> Sam
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Koenig, Christian
> >>>> Sent: Wednesday, September 20, 2017 12:21 PM
> >>>> To: Li, Samuel <Samuel.Li at amd.com>; amd-gfx at lists.freedesktop.org
> >>>> Subject: Re: [PATCH v2 1/1] drm/amdgpu: Add gem_prime_mmap support
> >>>>
> >>>> Am 20.09.2017 um 17:44 schrieb Li, Samuel:
> >>>>>> What happens when another thread is using amdgpu_dmabuf_ops to call
> >>>>>> begin_cpu_access/end_cpu_access when you are fixing it up again?
> >>>>> Right, that is an issue.
> >>>> A simple "if (!amdgpu_dmabuf_ops.begin_cpu_access)" should be able to
> >>>> deal with that.
> >>>>
> >>>>>> I would just completely drop the two callbacks, pinning is not
> >>>>>> necessary for CPU access and thinking more about it, it actually
> >>>>>> has some unwanted side effects.
> >>>>> CPU access needs synchronization anyway, so the two callbacks cannot be
> >>>>> dropped (other drivers implemented them too), so I would like to keep
> >>>>> them there for now.
> >>>>
> >>>> Wait a second, what do you mean with "CPU access needs synchronization"?
> >>>> At least i915 makes the memory GPU inaccessible when you start to
> >>>> use it with the CPU.
> >>>>
> >>>> If that is what this callback should do then this implementation
> >>>> would be incorrect. Pinning doesn't wait for any GPU operation to finish.
> >>>>
> >>>> Regards,
> >>>> Christian.
> >>>>
> >>>>> Sam
> >>>>>
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Koenig, Christian
> >>>>>> Sent: Wednesday, September 20, 2017 2:58 AM
> >>>>>> To: Li, Samuel <Samuel.Li at amd.com>; amd-gfx at lists.freedesktop.org
> >>>>>> Subject: Re: [PATCH v2 1/1] drm/amdgpu: Add gem_prime_mmap support
> >>>>>>
> >>>>>>> What do you mean "This isn't race free"?
> >>>>>> Take a look at the code again:
> >>>>>>> +    dma_buf = drm_gem_prime_export(dev, gobj, flags);
> >>>>>>> +    amdgpu_dmabuf_ops = *(dma_buf->ops);
> >>>>>>> +    amdgpu_dmabuf_ops.begin_cpu_access = amdgpu_gem_begin_cpu_access;
> >>>>>>> +    amdgpu_dmabuf_ops.end_cpu_access = amdgpu_gem_end_cpu_access;
> >>>>>>> +    dma_buf->ops = &amdgpu_dmabuf_ops;
> >>>>>> What happens when another thread is using amdgpu_dmabuf_ops to call
> >>>>>> begin_cpu_access/end_cpu_access when you are fixing it up again?
> >>>>>>
> >>>>>> I would just completely drop the two callbacks, pinning is not
> >>>>>> necessary for CPU access and thinking more about it, it actually
> >>>>>> has some unwanted side effects.
> >>>>>>
> >>>>>> Regards,
> >>>>>> Christian.
> >>>>>>
> >>>>>> Am 19.09.2017 um 23:22 schrieb Samuel Li:
> >>>>>>>>> +    vma->vm_pgoff = amdgpu_bo_mmap_offset(bo) >> PAGE_SHIFT;
> >>>>>>>> Maybe better use "vma->vm_pgoff += amdgpu_bo_mmap_offset(bo) >> PAGE_SHIFT;",
> >>>>>>>> but I'm not sure.
> >>>>>>>> How do other drivers handle this?
> >>>>>>> This is a good catch. Looks like pgoff is honored during prime
> >>>>>>> mmap, not a fake offset here.
> >>>>>>>>> +    dma_buf->ops = &amdgpu_dmabuf_ops;
> >>>>>>>> This isn't race free and needs to be fixed.
> >>>>>>>> Better add callbacks to drm_prime.c similar to drm_gem_dmabuf_mmap().
> >>>>>>> What do you mean "This isn't race free"?
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Sam
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 2017-09-15 11:05 AM, Christian König wrote:
> >>>>>>>> Am 14.09.2017 um 00:39 schrieb Samuel Li:
> >>>>>>>>> v2: drop hdp invalidate/flush.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Samuel Li <Samuel.Li at amd.com>
> >>>>>>>>> ---
> >>>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu.h       |  3 ++
> >>>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c   |  3 +-
> >>>>>>>>>  drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c | 77 ++++++++++++++++++++++++++++++-
> >>>>>>>>>  3 files changed, 81 insertions(+), 2 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>>>>>>> index d2aaad7..188b705 100644
> >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
> >>>>>>>>> @@ -395,11 +395,14 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
> >>>>>>>>>  struct dma_buf *amdgpu_gem_prime_export(struct drm_device *dev,
> >>>>>>>>>                                          struct drm_gem_object *gobj,
> >>>>>>>>>                                          int flags);
> >>>>>>>>> +struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
> >>>>>>>>> +                                               struct dma_buf *dma_buf);
> >>>>>>>>>  int amdgpu_gem_prime_pin(struct drm_gem_object *obj);
> >>>>>>>>>  void amdgpu_gem_prime_unpin(struct drm_gem_object *obj);
> >>>>>>>>>  struct reservation_object *amdgpu_gem_prime_res_obj(struct drm_gem_object *);
> >>>>>>>>>  void *amdgpu_gem_prime_vmap(struct drm_gem_object *obj);
> >>>>>>>>>  void amdgpu_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr);
> >>>>>>>>> +int amdgpu_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma);
> >>>>>>>>>  int amdgpu_gem_debugfs_init(struct amdgpu_device *adev);
> >>>>>>>>>
> >>>>>>>>>  /* sub-allocation manager, it has to be protected by another lock.
> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>>>>> index 2cdf844..9b63ac5 100644
> >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
> >>>>>>>>> @@ -835,7 +835,7 @@ static struct drm_driver kms_driver = {
> >>>>>>>>>      .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
> >>>>>>>>>      .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
> >>>>>>>>>      .gem_prime_export = amdgpu_gem_prime_export,
> >>>>>>>>> -    .gem_prime_import = drm_gem_prime_import,
> >>>>>>>>> +    .gem_prime_import = amdgpu_gem_prime_import,
> >>>>>>>>>      .gem_prime_pin = amdgpu_gem_prime_pin,
> >>>>>>>>>      .gem_prime_unpin = amdgpu_gem_prime_unpin,
> >>>>>>>>>      .gem_prime_res_obj = amdgpu_gem_prime_res_obj,
> >>>>>>>>> @@ -843,6 +843,7 @@ static struct drm_driver kms_driver = {
> >>>>>>>>>      .gem_prime_import_sg_table = amdgpu_gem_prime_import_sg_table,
> >>>>>>>>>      .gem_prime_vmap = amdgpu_gem_prime_vmap,
> >>>>>>>>>      .gem_prime_vunmap = amdgpu_gem_prime_vunmap,
> >>>>>>>>> +    .gem_prime_mmap = amdgpu_gem_prime_mmap,
> >>>>>>>>>      .name = DRIVER_NAME,
> >>>>>>>>>      .desc = DRIVER_DESC,
> >>>>>>>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> >>>>>>>>> index 5b3f928..13c977a 100644
> >>>>>>>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> >>>>>>>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
> >>>>>>>>> @@ -57,6 +57,40 @@ void amdgpu_gem_prime_vunmap(struct drm_gem_object *obj, void *vaddr)
> >>>>>>>>>      ttm_bo_kunmap(&bo->dma_buf_vmap);
> >>>>>>>>>  }
> >>>>>>>>> +
> >>>>>>>>> +int amdgpu_gem_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)
> >>>>>>>>> +{
> >>>>>>>>> +    struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
> >>>>>>>>> +    struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
> >>>>>>>>> +    unsigned asize = amdgpu_bo_size(bo);
> >>>>>>>>> +    int ret;
> >>>>>>>>> +
> >>>>>>>>> +    if (!vma->vm_file)
> >>>>>>>>> +        return -ENODEV;
> >>>>>>>>> +
> >>>>>>>>> +    if (adev == NULL)
> >>>>>>>>> +        return -ENODEV;
> >>>>>>>>> +
> >>>>>>>>> +    /* Check for valid size. */
> >>>>>>>>> +    if (asize < vma->vm_end - vma->vm_start)
> >>>>>>>>> +        return -EINVAL;
> >>>>>>>>> +
> >>>>>>>>> +    if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm) ||
> >>>>>>>>> +        (bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS)) {
> >>>>>>>>> +        return -EPERM;
> >>>>>>>>> +    }
> >>>>>>>>> +    vma->vm_pgoff = amdgpu_bo_mmap_offset(bo) >> PAGE_SHIFT;
> >>>>>>>> Maybe better use "vma->vm_pgoff += amdgpu_bo_mmap_offset(bo) >> PAGE_SHIFT;",
> >>>>>>>> but I'm not sure.
> >>>>>>>> How do other drivers handle this?
> >>>>>>>>
> >>>>>>>>> +
> >>>>>>>>> +    /* prime mmap does not need to check access, so allow here */
> >>>>>>>>> +    ret = drm_vma_node_allow(&obj->vma_node, vma->vm_file->private_data);
> >>>>>>>>> +    if (ret)
> >>>>>>>>> +        return ret;
> >>>>>>>>> +
> >>>>>>>>> +    ret = ttm_bo_mmap(vma->vm_file, vma, &adev->mman.bdev);
> >>>>>>>>> +    drm_vma_node_revoke(&obj->vma_node, vma->vm_file->private_data);
> >>>>>>>>> +
> >>>>>>>>> +    return ret;
> >>>>>>>>> +}
> >>>>>>>>> +
> >>>>>>>>>  struct drm_gem_object *
> >>>>>>>>>  amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
> >>>>>>>>>                                   struct dma_buf_attachment *attach,
> >>>>>>>>> @@ -130,14 +164,55 @@ struct reservation_object *amdgpu_gem_prime_res_obj(struct drm_gem_object *obj)
> >>>>>>>>>      return bo->tbo.resv;
> >>>>>>>>>  }
> >>>>>>>>> +
> >>>>>>>>> +static int amdgpu_gem_begin_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction)
> >>>>>>>>> +{
> >>>>>>>>> +    return amdgpu_gem_prime_pin(dma_buf->priv);
> >>>>>>>>> +}
> >>>>>>>>> +
> >>>>>>>>> +static int amdgpu_gem_end_cpu_access(struct dma_buf *dma_buf, enum dma_data_direction direction)
> >>>>>>>>> +{
> >>>>>>>>> +    amdgpu_gem_prime_unpin(dma_buf->priv);
> >>>>>>>>> +
> >>>>>>>>> +    return 0;
> >>>>>>>>> +}
> >>>>>>>>> +
> >>>>>>>>> +static struct dma_buf_ops amdgpu_dmabuf_ops;
> >>>>>>>>> +
> >>>>>>>>>  struct dma_buf *amdgpu_gem_prime_export(struct drm_device *dev,
> >>>>>>>>>                                          struct drm_gem_object *gobj,
> >>>>>>>>>                                          int flags)
> >>>>>>>>>  {
> >>>>>>>>>      struct amdgpu_bo *bo = gem_to_amdgpu_bo(gobj);
> >>>>>>>>> +    struct dma_buf *dma_buf;
> >>>>>>>>>      if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm))
> >>>>>>>>>          return ERR_PTR(-EPERM);
> >>>>>>>>> -    return drm_gem_prime_export(dev, gobj, flags);
> >>>>>>>>> +    dma_buf = drm_gem_prime_export(dev, gobj, flags);
> >>>>>>>>> +    amdgpu_dmabuf_ops = *(dma_buf->ops);
> >>>>>>>>> +    amdgpu_dmabuf_ops.begin_cpu_access = amdgpu_gem_begin_cpu_access;
> >>>>>>>>> +    amdgpu_dmabuf_ops.end_cpu_access = amdgpu_gem_end_cpu_access;
> >>>>>>>>> +    dma_buf->ops = &amdgpu_dmabuf_ops;
> >>>>>>>> This isn't race free and needs to be fixed.
> >>>>>>>>
> >>>>>>>> Better add callbacks to drm_prime.c similar to drm_gem_dmabuf_mmap().
> >>>>>>>> Alternatively you could just completely drop amdgpu_gem_begin_cpu_access()
> >>>>>>>> and amdgpu_gem_end_cpu_access() as well.
> >>>>>>>> When the buffer is shared between devices it is pinned anyway and when
> >>>>>>>> it isn't shared ttm_bo_mmap() is able to handle VRAM mappings as well.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Christian.
> >>>>>>>>
> >>>>>>>>> +
> >>>>>>>>> +    return dma_buf;
> >>>>>>>>> +}
> >>>>>>>>> +
> >>>>>>>>> +struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev,
> >>>>>>>>> +                                               struct dma_buf *dma_buf)
> >>>>>>>>> +{
> >>>>>>>>> +    struct drm_gem_object *obj;
> >>>>>>>>> +
> >>>>>>>>> +    if (dma_buf->ops == &amdgpu_dmabuf_ops) {
> >>>>>>>>> +        obj = dma_buf->priv;
> >>>>>>>>> +        if (obj->dev == dev) {
> >>>>>>>>> +            /*
> >>>>>>>>> +             * Importing dmabuf exported from out own gem increases
> >>>>>>>>> +             * refcount on gem itself instead of f_count of dmabuf.
> >>>>>>>>> +             */
> >>>>>>>>> +            drm_gem_object_get(obj);
> >>>>>>>>> +            return obj;
> >>>>>>>>> +        }
> >>>>>>>>> +    }
> >>>>>>>>> +
> >>>>>>>>> +    return drm_gem_prime_import(dev, dma_buf);
> >>>>>>>>>  }
> >
> > _______________________________________________
> > amd-gfx mailing list
> > amd-gfx at lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/amd-gfx
>
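
P.S.: For reference, here is the untested sketch mentioned above. It shows
begin_cpu_access() waiting for all fences on the BO via
reservation_object_wait_timeout_rcu() instead of pinning, which is how I read
the suggestion. It assumes it lives in amdgpu_prime.c next to the existing
prime callbacks (so the needed dma-buf/TTM headers are already included) and
deliberately leaves out any domain move or HDP handling:

static int amdgpu_gem_begin_cpu_access(struct dma_buf *dma_buf,
                                       enum dma_data_direction direction)
{
        struct drm_gem_object *obj = dma_buf->priv;
        struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj);
        long ret;

        /*
         * Wait for all fences (shared and exclusive) on the reservation
         * object, i.e. for all pending GPU access to the BO, before the
         * CPU starts touching it. Interruptible wait, unbounded timeout.
         */
        ret = reservation_object_wait_timeout_rcu(bo->tbo.resv, true, true,
                                                  MAX_SCHEDULE_TIMEOUT);
        if (ret < 0)
                return ret;

        return 0;
}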
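
P.P.S.: Regarding the earlier "This isn't race free" point, one possible
reading of the "if (!amdgpu_dmabuf_ops.begin_cpu_access)" suggestion, again
only an untested sketch (it relies on every dma-buf coming out of
drm_gem_prime_export() starting with the same default ops):

struct dma_buf *amdgpu_gem_prime_export(struct drm_device *dev,
                                        struct drm_gem_object *gobj,
                                        int flags)
{
        struct amdgpu_bo *bo = gem_to_amdgpu_bo(gobj);
        struct dma_buf *dma_buf;

        if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm))
                return ERR_PTR(-EPERM);

        dma_buf = drm_gem_prime_export(dev, gobj, flags);
        if (IS_ERR(dma_buf))
                return dma_buf;

        /*
         * Fill in the static ops copy only on the first export; later
         * exports just reuse it, so a concurrent begin_cpu_access()/
         * end_cpu_access() caller never sees the struct being rewritten.
         */
        if (!amdgpu_dmabuf_ops.begin_cpu_access) {
                amdgpu_dmabuf_ops = *dma_buf->ops;
                amdgpu_dmabuf_ops.begin_cpu_access = amdgpu_gem_begin_cpu_access;
                amdgpu_dmabuf_ops.end_cpu_access = amdgpu_gem_end_cpu_access;
        }
        dma_buf->ops = &amdgpu_dmabuf_ops;

        return dma_buf;
}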