Re: [PATCH] drm/msm: Check for the GPU IOMMU during bind

Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx> · Thu, 20 Jul 2023 20:11:22 +0300

On 20/07/2023 18:52, Rob Clark wrote:
On Thu, Jul 6, 2023 at 11:55 AM Dmitry Baryshkov
<dmitry.baryshkov@xxxxxxxxxx> wrote:

On 10/03/2023 00:20, Jordan Crouse wrote:
While booting with amd,imageon on a headless target the GPU probe was
failing with -ENOSPC in get_pages() from msm_gem.c.

Investigation showed that the driver was using the default 16MB VRAM
carveout because msm_use_mmu() was returning false since headless devices
use a dummy parent device. Avoid this by extending the existing is_a2xx
priv member to check the GPU IOMMU state on all platforms and use that
check in msm_use_mmu().

This works for memory allocations but it doesn't prevent the VRAM carveout
from being created because that happens before we have a chance to check
the GPU IOMMU state in adreno_bind.

There are a number of possible options to resolve this but none of them are
very clean. The easiest way is likely to specify vram=0 as a module parameter
on headless devices so that the memory doesn't get wasted.

This patch was on my plate for quite a while, please excuse me for
taking it so long.

I see the following problem with the current code. We have two different
instances that can access memory: MDP/DPU and GPU. And each of them can
either have or lack an MMU.

For some time I toyed with the idea of determining whether the allocated
BO is going to be used by display or by GPU, but then I abandoned it. We
can have display BOs being filled by GPU, so handling it this way would
complicate things a lot.

There is MSM_BO_SCANOUT .. but it wouldn't completely surprise me if
it isn't used in some place or other where it should be.  But
that is the hint that contiguous allocation should be used if the
display doesn't support some sort of iommu.  (Using a GPU without some
sort of mmu/iommu isn't something sane to do.. the only reason the
support for that exists at all is to aid bringup.  I wouldn't call
that a "supported" configuration.)

This actually rings a tiny bell in my head with the idea of splitting
the display and GPU parts to two different drivers, but I'm not sure
what would be the overall impact.

Userspace does have better support for split display/gpu these days
than it did when drm/msm was first merged.  It _might_ just work if
one device only advertised DRIVER_RENDER and the other
MODESET/ATOMIC.. but I'd be a bit concerned about breaking things.  I
guess you could try some sort of kconfig knob to have two "msm"
devices and see what breaks, but I'm a bit skeptical that we could
make this the default anytime soon.

Thanks. Yes, breaking userspace would be a bad thing. I do not know if 
we should consider a single GPU+KMS driver to be an ABI and thus set in 
stone.


For now, just addressing the only-display and only-gpu cases
(continuing with the single device arrangement when you have both
display and gpu), maybe split up drm_dev_alloc() and drm_dev_init() so
that we could use drm_device::driver_features to mask out
DRIVER_RENDER if needed.

Yep. I'll continue following that path.
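
To make the driver_features idea concrete, here is a minimal standalone
sketch, not driver code. The sketch_* types and helpers are hypothetical, and
the flag values are stand-ins (the real ones come from drm_drv.h). It assumes
a feature counts as enabled only if both the driver's static mask and the
per-device mask have it set. Masking MODESET/ATOMIC for the headless case is
an extra assumption, not something proposed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag values for this sketch; the real ones live in drm_drv.h. */
#define DRIVER_GEM     (1u << 0)
#define DRIVER_MODESET (1u << 1)
#define DRIVER_RENDER  (1u << 3)
#define DRIVER_ATOMIC  (1u << 4)

/* Hypothetical, trimmed-down stand-ins for drm_driver and drm_device. */
struct sketch_driver {
	uint32_t driver_features;	/* static, shared by all devices */
};

struct sketch_device {
	const struct sketch_driver *driver;
	uint32_t driver_features;	/* per-device mask, starts as all ones */
};

/* A feature is on only if both the driver and the device mask allow it. */
static bool sketch_check_feature(const struct sketch_device *dev,
				 uint32_t feature)
{
	return (dev->driver->driver_features &
		dev->driver_features & feature) != 0;
}

/*
 * Clear per-device feature bits once we know which blocks are present.
 * Would have to run between allocating and registering the device.
 */
static void sketch_mask_features(struct sketch_device *dev,
				 bool have_gpu, bool have_display)
{
	if (!have_gpu)
		dev->driver_features &= ~DRIVER_RENDER;

	/* Assumption of this sketch: headless devices drop KMS features. */
	if (!have_display)
		dev->driver_features &= ~(DRIVER_MODESET | DRIVER_ATOMIC);
}
```

With a display-only device, calling sketch_mask_features(&dev, false, true)
leaves the driver's static mask untouched while sketch_check_feature() stops
reporting DRIVER_RENDER for that one device.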


BR,
-R

More on the msm_use_mmu() below.


Signed-off-by: Jordan Crouse <jorcrous@xxxxxxxxxx>
---

   drivers/gpu/drm/msm/adreno/adreno_device.c | 6 +++++-
   drivers/gpu/drm/msm/msm_drv.c              | 7 +++----
   drivers/gpu/drm/msm/msm_drv.h              | 2 +-
   3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_device.c b/drivers/gpu/drm/msm/adreno/adreno_device.c
index 36f062c7582f..4f19da28f80f 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_device.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_device.c
@@ -539,7 +539,11 @@ static int adreno_bind(struct device *dev, struct device *master, void *data)
       DBG("Found GPU: %u.%u.%u.%u", config.rev.core, config.rev.major,
               config.rev.minor, config.rev.patchid);

-     priv->is_a2xx = config.rev.core == 2;
+     /*
+      * A2xx has a built in IOMMU and all other IOMMU enabled targets will
+      * have an ARM IOMMU attached
+      */
+     priv->has_gpu_iommu = config.rev.core == 2 || device_iommu_mapped(dev);
       priv->has_cached_coherent = config.rev.core >= 6;

       gpu = info->init(drm);
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index aca48c868c14..a125a351ec90 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -318,11 +318,10 @@ bool msm_use_mmu(struct drm_device *dev)
       struct msm_drm_private *priv = dev->dev_private;

       /*
-      * a2xx comes with its own MMU
-      * On other platforms IOMMU can be declared specified either for the
-      * MDP/DPU device or for its parent, MDSS device.
+      * Return true if the GPU or the MDP/DPU or parent MDSS device has an
+      * IOMMU
        */
-     return priv->is_a2xx ||
+     return priv->has_gpu_iommu ||
               device_iommu_mapped(dev->dev) ||
               device_iommu_mapped(dev->dev->parent);

I have a general feeling that neither the old nor the new code is fully
correct. Please correct me if I'm wrong:

We should be using VRAM, if either of the blocks can not use remapped
memory. So this should have been:

bool msm_use_mmu()
{
   if (!gpu_has_iommu)
     return false;

   if (have_display_part && !display_has_mmu())
     return false;

   return true;
}

What do you think?
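
To compare the two policies side by side, here is a standalone sketch, not
driver code. struct mmu_state and both helper names are hypothetical;
display_has_mmu is assumed to already cover both the MDP/DPU device and its
parent MDSS device.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the real driver would have to query. */
struct mmu_state {
	bool gpu_has_iommu;	/* a2xx built-in MMU or an attached ARM IOMMU */
	bool have_display_part;	/* false on headless targets */
	bool display_has_mmu;	/* MDP/DPU or its parent MDSS has an IOMMU */
};

/* Patch as posted: one block with an IOMMU is enough to skip VRAM. */
static bool use_mmu_any(const struct mmu_state *s)
{
	return s->gpu_has_iommu ||
	       (s->have_display_part && s->display_has_mmu);
}

/* Pseudocode above: every block that is present must be able to remap. */
static bool use_mmu_all(const struct mmu_state *s)
{
	if (!s->gpu_has_iommu)
		return false;

	if (s->have_display_part && !s->display_has_mmu)
		return false;

	return true;
}
```

The two agree for a headless GPU with an IOMMU (both true). They differ when
the GPU has an IOMMU but the display does not: use_mmu_any() returns true and
use_mmu_all() returns false. As written, use_mmu_all() also returns false for
a display-only device, since gpu_has_iommu would be false there; whether that
is intended would need to be decided separately.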

   }
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index 9f0c184b02a0..f33f94acd1b9 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -126,7 +126,7 @@ struct msm_drm_private {
       struct msm_gpu *gpu;

       /* gpu is only set on open(), but we need this info earlier */
-     bool is_a2xx;
+     bool has_gpu_iommu;
       bool has_cached_coherent;

       struct drm_fb_helper *fbdev;

--
With best wishes
Dmitry

