Re: [RESEND PATCH v2 4/5] drm/msm: add DRM_MSM_GEM_SYNC_CACHE for non-coherent cache maintenance

On 11/14/20 2:39 PM, Rob Clark wrote:
On Sat, Nov 14, 2020 at 10:58 AM Jonathan Marek <jonathan@xxxxxxxx> wrote:

On 11/14/20 1:46 PM, Rob Clark wrote:
On Sat, Nov 14, 2020 at 8:24 AM Christoph Hellwig <hch@xxxxxx> wrote:

On Sat, Nov 14, 2020 at 10:17:12AM -0500, Jonathan Marek wrote:
+void msm_gem_sync_cache(struct drm_gem_object *obj, uint32_t flags,
+             size_t range_start, size_t range_end)
+{
+     struct msm_gem_object *msm_obj = to_msm_bo(obj);
+     struct device *dev = msm_obj->base.dev->dev;
+
+     /* exit early if get_pages() hasn't been called yet */
+     if (!msm_obj->pages)
+             return;
+
+     /* TODO: sync only the specified range */
+
+     if (flags & MSM_GEM_SYNC_FOR_DEVICE) {
+             dma_sync_sg_for_device(dev, msm_obj->sgt->sgl,
+                             msm_obj->sgt->nents, DMA_TO_DEVICE);
+     }
+
+     if (flags & MSM_GEM_SYNC_FOR_CPU) {
+             dma_sync_sg_for_cpu(dev, msm_obj->sgt->sgl,
+                             msm_obj->sgt->nents, DMA_FROM_DEVICE);
+     }

Splitting this helper from the only caller is rather strange, especially
with the two unused arguments.  And I think specifying this to take a
range but then ignoring it is actively dangerous.  User space will
rely on it syncing everything sooner or later and then you are stuck.
So just define a sync all primitive for now, and if you really need a
range sync and have actually implemented it add a new ioctl for that.

We do already have a split of ioctl "layer" which enforces valid ioctl
params, etc, and gem (or other) module code which is called by the
ioctl func.  So I think it is fine to keep this split here.  (Also, I
think at some point there will be a uring type of ioctl alternative
which would re-use the same gem func.)

But I do agree that the range should be respected or added later..
drm_ioctl() dispatch is well prepared for extending ioctls.
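
To illustrate the split described above, here is a minimal sketch of the kind of parameter check the ioctl layer could run before calling into the gem function. Everything in it is an assumption made for illustration: the name sketch_validate_sync_args, the placeholder flag values (the real uapi values are not shown in this thread), and the choice to treat the range as half-open [start, end).

```c
#include <errno.h>
#include <stdint.h>

/* Placeholder flag values, assumed for this sketch only. */
#define SKETCH_SYNC_FOR_DEVICE 0x1u
#define SKETCH_SYNC_FOR_CPU    0x2u
#define SKETCH_SYNC_VALID      (SKETCH_SYNC_FOR_DEVICE | SKETCH_SYNC_FOR_CPU)

/*
 * Hypothetical ioctl-layer check: reject unknown or empty flags and any
 * range that is empty, inverted, or extends past the end of the object.
 * Returns 0 when the arguments are acceptable, -EINVAL otherwise.
 */
static int sketch_validate_sync_args(uint32_t flags, uint64_t start,
                                     uint64_t end, uint64_t obj_size)
{
    if (flags == 0 || (flags & ~SKETCH_SYNC_VALID))
        return -EINVAL;
    if (start >= end || end > obj_size)
        return -EINVAL;
    return 0;
}
```

Rejecting unknown flag bits up front is what leaves room to give them a meaning later without breaking existing userspace.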

And I assume there should be some validation that the range is aligned
to cache-line?  Or can we flush a partial cache line?


The range is intended to be "sync at least this range", so that
userspace doesn't have to worry about details like that.
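
One way to read "sync at least this range" is that the kernel side widens the requested range outward to cache-line boundaries before syncing. A minimal sketch of that rounding follows. The helper name, the struct, and passing the line size as a parameter are all hypothetical, and the range is assumed to be half-open [start, end); none of this is in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: a half-open byte range [start, end). */
struct sketch_range {
    size_t start;
    size_t end;
};

/*
 * Widen [start, end) so that both ends fall on cache-line boundaries.
 * 'line' is the cache-line size in bytes and must be a power of two.
 * Overflow of 'end + line - 1' is not handled in this sketch.
 */
static struct sketch_range sketch_widen_to_cache_lines(size_t start,
                                                       size_t end,
                                                       size_t line)
{
    struct sketch_range r;

    assert(line != 0 && (line & (line - 1)) == 0);
    assert(start <= end);

    r.start = start & ~(line - 1);            /* round start down */
    r.end = (end + line - 1) & ~(line - 1);   /* round end up */
    return r;
}
```

With a 64-byte line, a request for bytes 70 to 130 would be widened to 64 to 192, so the sync also touches bytes the caller did not name.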


I don't think userspace can *not* worry about details like that.
Consider a case where the cpu and gpu are simultaneously accessing
different parts of a buffer (for example, sub-allocation).  There needs to
be cache-line separation between the two.
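
The separation requirement can be stated as a check: two sub-allocations are only safe to access from the cpu and gpu at the same time if no cache line contains bytes from both. A minimal sketch, with a hypothetical helper name, half-open ranges assumed, and the line size passed in as a parameter:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if the non-empty half-open ranges [a_start, a_end) and
 * [b_start, b_end) touch at least one common cache line of 'line' bytes.
 */
static bool sketch_ranges_share_cache_line(size_t a_start, size_t a_end,
                                           size_t b_start, size_t b_end,
                                           size_t line)
{
    assert(line != 0);
    assert(a_start < a_end && b_start < b_end);

    /* Index of the first and last cache line each range touches. */
    size_t a_first = a_start / line, a_last = (a_end - 1) / line;
    size_t b_first = b_start / line, b_last = (b_end - 1) / line;

    /* The line-index intervals overlap iff each starts before the other ends. */
    return a_first <= b_last && b_first <= a_last;
}
```

With 64-byte lines, sub-allocations at bytes 0 to 100 and 100 to 200 do not overlap as byte ranges but both touch the line covering bytes 64 to 127, so they fail this check; sub-allocations at 0 to 128 and 128 to 256 pass.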


Right.. and it also seems like we can't get away with just flushing/invalidating the whole thing.

qcom's vulkan driver has nonCoherentAtomSize=1, and it looks like dma_sync_single_for_cpu() does deal in some way with the partial cache line case, although I'm not sure that means we can have a nonCoherentAtomSize=1.

BR,
-R



