Am Mittwoch, dem 21.06.2023 um 23:30 +0800 schrieb Sui Jingfeng: > Hi, > > On 2023/6/21 18:00, Lucas Stach wrote: > > > dma_sync_sgtable_for_cpu(dev->dev, etnaviv_obj->sgt, > > > etnaviv_op_to_dma_dir(op)); > > > etnaviv_obj->last_cpu_prep_op = op; > > > @@ -408,8 +421,9 @@ int etnaviv_gem_cpu_fini(struct drm_gem_object *obj) > > > { > > > struct drm_device *dev = obj->dev; > > > struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj); > > > + struct etnaviv_drm_private *priv = dev->dev_private; > > > > > > - if (etnaviv_obj->flags & ETNA_BO_CACHED) { > > > + if (!priv->dma_coherent && etnaviv_obj->flags & ETNA_BO_CACHED) { > > > /* fini without a prep is almost certainly a userspace error */ > > > WARN_ON(etnaviv_obj->last_cpu_prep_op == 0); > > > dma_sync_sgtable_for_device(dev->dev, etnaviv_obj->sgt, > > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c > > > index 3524b5811682..754126992264 100644 > > > --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c > > > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c > > > @@ -112,11 +112,16 @@ static const struct etnaviv_gem_ops etnaviv_gem_prime_ops = { > > > struct drm_gem_object *etnaviv_gem_prime_import_sg_table(struct drm_device *dev, > > > struct dma_buf_attachment *attach, struct sg_table *sgt) > > > { > > > + struct etnaviv_drm_private *priv = dev->dev_private; > > > struct etnaviv_gem_object *etnaviv_obj; > > > size_t size = PAGE_ALIGN(attach->dmabuf->size); > > > + u32 cache_flags = ETNA_BO_WC; > > > int ret, npages; > > > > > > - ret = etnaviv_gem_new_private(dev, size, ETNA_BO_WC, > > > + if (priv->dma_coherent) > > > + cache_flags = ETNA_BO_CACHED; > > > + > > Drop this change. Instead etnaviv_gem_new_impl() should do the upgrade > > from WC to CACHED as necessary by adding something like this: > > I understand you are a profession person in vivante GPU driver domain. > > I respect you reviews and instruction. > > But, I'm really reluctant to agree with this, is there any space to > negotiate? > > > /* > > * Upgrade WC to CACHED when the device is hardware coherent and the > > * platform doesn't allow mixing cached and writecombined mappings to > > * the same memory area. > > */ > > if ((flags & ETNA_BO_CACHE_MASK) == ETNA_BO_WC && > > dev_is_dma_coherent(dev) && !drm_arch_can_wc_memory()) > > flags = (flags & ~ETNA_BO_CACHE_MASK) & ETNA_BO_CACHED; > > This is policy, not a mechanism. > > Using what cache property is a user-space program's choice. > > While you are override the WC with CACHED mapping. This is not correct > in the concept! > Please explain why you think that this isn't correct. If using WC mappings cause a potential loss of coherency on your platform, then we can not allow the userspace driver to use WC mappings. As I would like to keep the option of WC mappings, I've asked you if there are ways to prepare the cache in a way that WC mappings aren't causing any troubles on your platform. You told me that this might be possible but needs confirmation from a HW engineer and such confirmation could take a long time. With that in mind, our only option right now is to upgrade the mappings to cached in order to not lay out traps for the userspace driver. > you approach forbidden any possibility to use the WC BO at anywhere. > > > My approach need only check once, while you approach need at least 3 > check plus > > so much bit-wise logic operations, plus a function call (&, ==, &&, > &, ~, &) . > > and every time you create a BO. This nasty judgement happens. > BO creation again is not a fast path. You are committing to allocate new memory, which is a few orders of magnitude more costly than the few instructions needed for those comparisons. > > Please keep our original implement, it's simple and clear, Please? > It isn't as simple and clear for the userspace interface. It allows userspace to use WC mappings that would potentially cause loss of coherency between CPU and GPU, which isn't acceptable. Regards, Lucas