Re: [PATCH v10 07/11] drm/etnaviv: Add support for the dma coherent device

Lucas Stach <l.stach@xxxxxxxxxxxxxx> · Wed, 21 Jun 2023 17:58:43 +0200

Am Mittwoch, dem 21.06.2023 um 23:30 +0800 schrieb Sui Jingfeng:
> Hi,
> 
> On 2023/6/21 18:00, Lucas Stach wrote:
> > >   		dma_sync_sgtable_for_cpu(dev->dev, etnaviv_obj->sgt,
> > >   					 etnaviv_op_to_dma_dir(op));
> > >   		etnaviv_obj->last_cpu_prep_op = op;
> > > @@ -408,8 +421,9 @@ int etnaviv_gem_cpu_fini(struct drm_gem_object *obj)
> > >   {
> > >   	struct drm_device *dev = obj->dev;
> > >   	struct etnaviv_gem_object *etnaviv_obj = to_etnaviv_bo(obj);
> > > +	struct etnaviv_drm_private *priv = dev->dev_private;
> > >   
> > > -	if (etnaviv_obj->flags & ETNA_BO_CACHED) {
> > > +	if (!priv->dma_coherent && etnaviv_obj->flags & ETNA_BO_CACHED) {
> > >   		/* fini without a prep is almost certainly a userspace error */
> > >   		WARN_ON(etnaviv_obj->last_cpu_prep_op == 0);
> > >   		dma_sync_sgtable_for_device(dev->dev, etnaviv_obj->sgt,
> > > diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
> > > index 3524b5811682..754126992264 100644
> > > --- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
> > > +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
> > > @@ -112,11 +112,16 @@ static const struct etnaviv_gem_ops etnaviv_gem_prime_ops = {
> > >   struct drm_gem_object *etnaviv_gem_prime_import_sg_table(struct drm_device *dev,
> > >   	struct dma_buf_attachment *attach, struct sg_table *sgt)
> > >   {
> > > +	struct etnaviv_drm_private *priv = dev->dev_private;
> > >   	struct etnaviv_gem_object *etnaviv_obj;
> > >   	size_t size = PAGE_ALIGN(attach->dmabuf->size);
> > > +	u32 cache_flags = ETNA_BO_WC;
> > >   	int ret, npages;
> > >   
> > > -	ret = etnaviv_gem_new_private(dev, size, ETNA_BO_WC,
> > > +	if (priv->dma_coherent)
> > > +		cache_flags = ETNA_BO_CACHED;
> > > +
> > Drop this change. Instead etnaviv_gem_new_impl() should do the upgrade
> > from WC to CACHED as necessary by adding something like this:
> 
> I understand you are a profession person in vivante GPU driver domain.
> 
> I respect you reviews and instruction.
> 
> But, I'm really reluctant to agree with this, is there any space to 
> negotiate?
> 
> > /*
> >   * Upgrade WC to CACHED when the device is hardware coherent and the
> >   * platform doesn't allow mixing cached and writecombined mappings to
> >   * the same memory area.
> >   */
> > if ((flags & ETNA_BO_CACHE_MASK) == ETNA_BO_WC &&
> >      dev_is_dma_coherent(dev) && !drm_arch_can_wc_memory())
> >          flags = (flags & ~ETNA_BO_CACHE_MASK) & ETNA_BO_CACHED;
> 
> This is policy, not a mechanism.
> 
> Using what cache property is a user-space program's choice.
> 
> While you are override the WC with CACHED mapping. This is not correct 
> in the concept!
> 
Please explain why you think that this isn't correct. If using WC
mappings cause a potential loss of coherency on your platform, then we
can not allow the userspace driver to use WC mappings.

As I would like to keep the option of WC mappings, I've asked you if
there are ways to prepare the cache in a way that WC mappings aren't
causing any troubles on your platform. You told me that this might be
possible but needs confirmation from a HW engineer and such
confirmation could take a long time.

With that in mind, our only option right now is to upgrade the mappings
to cached  in order to not lay out traps for the userspace driver.

> you approach forbidden any possibility to use the WC BO at anywhere.
> 
> 
> My approach need only check once, while you approach need at least 3 
> check plus
> 
> so much bit-wise logic operations,  plus a function call  (&, ==, &&,  
> &, ~, &) .
> 
> and every time you create a BO. This nasty judgement happens.
> 
BO creation again is not a fast path. You are committing to allocate
new memory, which is a few orders of magnitude more costly than the few
instructions needed for those comparisons.

> 
> Please keep our original implement, it's simple and clear, Please?
> 

It isn't as simple and clear for the userspace interface. It allows
userspace to use WC mappings that would potentially cause loss of
coherency between CPU and GPU, which isn't acceptable.

Regards,
Lucas