Re: [PATCH 3/5] drm/panfrost: Add a no execute flag for BO allocations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 19/07/2019 23:07, Rob Herring wrote:
On Fri, Jul 19, 2019 at 4:39 AM Steven Price <steven.price@xxxxxxx> wrote:

On 18/07/2019 18:03, Rob Herring wrote:
On Thu, Jul 18, 2019 at 9:03 AM Steven Price <steven.price@xxxxxxx> wrote:

On 17/07/2019 19:33, Rob Herring wrote:
Executable buffers have an alignment restriction that they can't cross
16MB boundary as the GPU program counter is 24-bits. This restriction is
currently not handled and we just get lucky. As current userspace

Actually it depends which GPU you are using - some do have a bigger
program counter - so it might be the choice of GPU to test on rather
than just luck. But kbase always assumes the worst case (24 bit) as in
practise that's enough.

assumes all BOs are executable, that has to remain the default. So add a
new PANFROST_BO_NOEXEC flag to allow userspace to indicate which BOs are
not executable.

There is also a restriction that executable buffers cannot start or end
on a 4GB boundary. This is mostly avoided as there is only 4GB of space
currently and the beginning is already blocked out for NULL ptr
detection. Add support to handle this restriction fully regardless of
the current constraints.

For existing userspace, all created BOs remain executable, but the GPU
VA alignment will be increased to the size of the BO. This shouldn't
matter as there is plenty of GPU VA space.

Cc: Tomeu Vizoso <tomeu.vizoso@xxxxxxxxxxxxx>
Cc: Boris Brezillon <boris.brezillon@xxxxxxxxxxxxx>
Cc: Robin Murphy <robin.murphy@xxxxxxx>
Cc: Steven Price <steven.price@xxxxxxx>
Cc: Alyssa Rosenzweig <alyssa@xxxxxxxxxxxxx>
Signed-off-by: Rob Herring <robh@xxxxxxxxxx>
---
  drivers/gpu/drm/panfrost/panfrost_drv.c | 20 +++++++++++++++++++-
  drivers/gpu/drm/panfrost/panfrost_gem.c | 18 ++++++++++++++++--
  drivers/gpu/drm/panfrost/panfrost_gem.h |  3 ++-
  drivers/gpu/drm/panfrost/panfrost_mmu.c |  3 +++
  include/uapi/drm/panfrost_drm.h         |  2 ++
  5 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_drv.c b/drivers/gpu/drm/panfrost/panfrost_drv.c
index d354b92964d5..b91e991bc6a3 100644
--- a/drivers/gpu/drm/panfrost/panfrost_drv.c
+++ b/drivers/gpu/drm/panfrost/panfrost_drv.c
@@ -49,7 +49,8 @@ static int panfrost_ioctl_create_bo(struct drm_device *dev, void *data,
       struct panfrost_gem_object *bo;
       struct drm_panfrost_create_bo *args = data;

-     if (!args->size || args->flags || args->pad)
+     if (!args->size || args->pad ||
+         (args->flags & ~PANFROST_BO_NOEXEC))
               return -EINVAL;

       bo = panfrost_gem_create_with_handle(file, dev, args->size, args->flags,
@@ -367,6 +368,22 @@ static struct drm_driver panfrost_drm_driver = {
       .gem_prime_mmap         = drm_gem_prime_mmap,
  };

+#define PFN_4G_MASK    ((SZ_4G - 1) >> PAGE_SHIFT)
+
+static void panfrost_drm_mm_color_adjust(const struct drm_mm_node *node,
+                                      unsigned long color,
+                                      u64 *start, u64 *end)
+{
+     /* Executable buffers can't start or end on a 4GB boundary */
+     if (!color) {
+             if ((*start & PFN_4G_MASK) == 0)
+                     (*start)++;
+
+             if ((*end & PFN_4G_MASK) == 0)
+                     (*end)--;
+     }
+}

Unless I'm mistaken this won't actually provide the guarantee if the
memory region is >4GB (which admittedly it isn't at the moment). For
example a 8GB region would have the beginning/end trimmed off, but
there's another 4GB in the middle which could be allocated next to.

Humm, other than not allowing sizes greater than 4G-(2*PAGE_SIZE), I'm
not sure how we solve that. I guess avoiding sizes of (n*4G -
PAGE_SIZE) would avoid the problem. Seems like almost 4GB executable
buffer should be enough for anyone(TM).

I was thinking of something like:

if ((*end & ~PFN_4G_MASK) != (*start & ~PFN_4G_MASK)) {
         if ((*start & PFN_4G_MASK) +
             (SZ_16M >> PAGE_SHIFT) < PFN_4G_MASK)
                 *end = (*start & ~PFN_4G_MASK) +
                         (SZ_4GB - SZ_16M) >> PAGE_SHIFT;
         else
                 *start = (*end & ~PFN_4G_MASK) + (SZ_16M) >> PAGE_SHIFT;
}

So split the region depending on where we can find a free 16MB region
which doesn't cross 4GB.

Here's what I ended up with. It's slightly different in that the start
and end don't get 16MB aligned. The code already takes care of the
alignment which also is not necessarily 16MB, but 'align = size' as
that's sufficient to not cross a 16MB boundary.

#define PFN_4G          (SZ_4G >> PAGE_SHIFT)
#define PFN_4G_MASK     (PFN_4G - 1)
#define PFN_16M         (SZ_16M >> PAGE_SHIFT)

static void panfrost_drm_mm_color_adjust(const struct drm_mm_node *node,
                                          unsigned long color,
                                          u64 *start, u64 *end)
{
         /* Executable buffers can't start or end on a 4GB boundary */
         if (!(color & PANFROST_BO_NOEXEC)) {
                 if ((*start & PFN_4G_MASK) == 0)
                         (*start)++;

                 if ((*end & PFN_4G_MASK) == 0)
                         (*end)--;

                 /* Ensure start and end are in the same 4GB range */
                 if ((*end & ~PFN_4G_MASK) != (*start & ~PFN_4G_MASK)) {
                         if ((*start & PFN_4G_MASK) + PFN_16M < PFN_4G_MASK)
                                 *end = (*start & ~PFN_4G_MASK) + PFN_4G - 1;
                         else
                                 *start = (*end & ~PFN_4G_MASK) + 1;

                 }
         }
}

If you're happy with the additional restriction that a buffer can't cross a 4GB boundary (which does seem to be required for certain kinds of buffers anyway), then I don't think it needs to be anywhere near that complex:

	if (!(*start & PFN_4G_MASK))
		*start++
	*end = min(*end, ALIGN(*start, PFN_4G) - 1);


But like you say: 4GB should be enough for anyone ;)

Also a minor issue, but we might want to consider having some constants
for 'color' - it's not obvious from this code that color==no_exec.

Yeah, I was just going worry about that when we had a second bit to pass.

One other 'minor' issue I noticed as I was writing the above. PAGE_SIZE
is of course the CPU page size - we could have the situation of CPU and
GPU page size being different (e.g. CONFIG_ARM64_64K_PAGES). I'm not
sure whether we want to support this or not (kbase doesn't). Also in
theory some GPUs do support 64K (AS_TRANSCFG_ADRMODE_AARCH64_64K) - but
kbase has never used this so I don't know if it works... ;)

Shhh! I have thought about that, but was ignoring for now.

In general, I don't think that should be too great a concern - we already handle it for IOMMUs, provided the minimum IOMMU page size is no larger than PAGE_SIZE, which should always be the case here too. Looking at how panfrost_mmu_map() is implemented, and given that we're already trying to handle stuff at 2MB granularity where possible, I don't see any obvious reasons for 64K pages not to work already (assuming there aren't any 4K assumptions baked into the userspace side). I might have to give it a try...

Robin.
_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel




[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux