Re: [PATCH RFC 00/24] Lima DRM driver

On 23.05.2018 at 16:13, Qiang Yu wrote:
On Wed, May 23, 2018 at 9:59 PM, Christian König
<christian.koenig@xxxxxxx> wrote:
On 23.05.2018 at 15:52, Qiang Yu wrote:
On Wed, May 23, 2018 at 5:29 PM, Christian König
<ckoenig.leichtzumerken@xxxxxxxxx> wrote:
On 18.05.2018 at 11:27, Qiang Yu wrote:
Kernel DRM driver for ARM Mali 400/450 GPUs.

This implementation mainly takes the amdgpu DRM driver as a reference.

- Mali 4xx GPUs have two kinds of processors, GP and PP. GP handles
     OpenGL vertex shader processing and PP handles fragment shader
     processing. Each processor has its own MMU, so the processors work
     in a virtual address space.
- There is only one GP but multiple PPs (at most 4 for Mali 400 and 8
     for Mali 450) in the same Mali 4xx GPU. All PPs are grouped
     together to handle a single fragment shader task, divided by
     FB output tiled pixels. The Mali 400 user space driver is
     responsible for assigning target tiled pixels to each PP, but
     Mali 450 has a HW module called DLBU that dynamically balances
     each PP's load.
- The user space driver allocates buffer objects and maps them into
     the GPU virtual address space, uploads the command stream and draw
     data through a CPU mmap of the buffer object, then submits a task
     to GP/PP with a register frame indicating where the command
     stream is, plus misc settings.
- There is no command stream validation/relocation because each user
     process has its own GPU virtual address space. The GP/PP MMU
     switches the virtual address space when consecutive tasks come
     from different user processes. Erroneous or malicious user space
     code just gets an MMU fault or a GP/PP error IRQ, after which the
     HW/SW is recovered.
- TTM is used as the MM. TTM_PL_TT type memory, allocated from the
     TTM page pool, is used as the content of lima buffer objects. All
     lima buffer objects get pinned with TTM_PL_FLAG_NO_EVICT at
     allocation, so there is no buffer eviction or swap for now. We
     need reverse engineering to see if and how GP/PP support MMU
     fault recovery (continuing execution). Otherwise we have to
     pin/unpin each involved buffer at task creation/deletion.

Well, pinning all memory is usually a no-go for upstreaming. But since you
are already using drm_sched for GPU task scheduling, why do you actually
need this?

The scheduler should take care of signaling all fences when the hardware is
done with its magic, and that is enough for TTM to note that a buffer object
is movable again (e.g. unpin it).
Please correct me if I'm wrong.

Well, you are wrong :)

One way to implement eviction/swap is like this:
call validation on each buffer involved in a task. But this won't
prevent a buffer from being evicted/swapped while the task is executing,
so a GPU MMU fault may happen, and in the handler we need to recover the
evicted/swapped buffer.

Another way is to pin/unpin the buffers involved when a task is created/freed.

The first way is better when memory load is low and the second way is better
when memory load is high. The first way also needs less memory.

So I'd prefer the first way, but because the GPU MMU fault
HW operation needs reverse engineering, I have to pin all buffers for now.
After the HW operation is clear, I can choose one way to implement.

The general approach is:
1.) Lock all BOs
2.) Validate all BOs
3.) Add the fence
4.) Unlock the BOs
This is the task prepare process, right?

Yes.
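
The four steps above can be sketched as a small user-space model. This is only an illustration under stated assumptions: the `toy_*` types and functions are hypothetical stand-ins invented for this sketch, not TTM or reservation-object APIs, and "validate" is reduced to marking the BO resident.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration, not the kernel's types. */
struct toy_fence { bool signaled; };

struct toy_bo {
    bool locked;             /* models the per-BO reservation lock */
    bool resident;           /* true once the backing store is in place */
    struct toy_fence *fence; /* last fence attached, NULL if none */
};

/* Try to lock one BO; fails if it is already locked. */
static bool toy_bo_trylock(struct toy_bo *bo)
{
    if (bo->locked)
        return false;
    bo->locked = true;
    return true;
}

static void toy_bo_unlock(struct toy_bo *bo)
{
    assert(bo->locked);
    bo->locked = false;
}

/* "Validate": make sure the BO is placed where the GPU can reach it. */
static void toy_bo_validate(struct toy_bo *bo)
{
    assert(bo->locked);
    bo->resident = true;
}

/*
 * Task prepare: 1) lock all BOs, 2) validate all BOs,
 * 3) add the fence, 4) unlock the BOs.
 * Returns 0 on success, or -1 if a lock could not be taken, in which
 * case every lock taken so far is dropped again.
 */
static int toy_task_prepare(struct toy_bo **bos, size_t n,
                            struct toy_fence *fence)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (!toy_bo_trylock(bos[i])) {
            while (i-- > 0)
                toy_bo_unlock(bos[i]);
            return -1;
        }
    }
    for (i = 0; i < n; i++)
        toy_bo_validate(bos[i]);
    for (i = 0; i < n; i++)
        bos[i]->fence = fence;
    for (i = 0; i < n; i++)
        toy_bo_unlock(bos[i]);
    return 0;
}
```

The point of the ordering is that the fence is attached while the locks are still held, so there is no window in which a validated BO is unlocked but not yet protected by a fence.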

BOs can't be evicted while they are locked
During the task prepare stage they're locked, but after the task is queued
they get unlocked and become evictable?

Yes, the fence you added to the BO prevents TTM from evicting the BO until the fence has signaled.


and since you already add the
fence that should be perfectly sufficient to prevent it from being evicted
until your operation is completed.
Do you mean I have to explicitly pin it with TTM_PL_FLAG_NO_EVICT
at task creation, or that TTM will check the buffer's reservation object
and won't evict it if it sees a fence?

The second. You *don't* have to explicitly pin it with TTM_PL_FLAG_NO_EVICT as long as you always add the correct fence with your command submissions.

When evicting something, TTM will take a look at the fences assigned to the BO and either not evict it at all or wait for all fences to be completed before doing so.

If you need to update some internal state, flush caches or the like when a BO is evicted, TTM also has callbacks for this.
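
The eviction behaviour described above (skip a busy BO, or wait for its fences first) can be modelled roughly as follows. This is a simplified sketch, not TTM code: all `ev_*` names are hypothetical, and "waiting" is faked by marking the fence as signaled, where a real implementation would block until the hardware finishes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EV_MAX_FENCES 4

/* Hypothetical stand-ins for illustration, not the kernel's types. */
struct ev_fence { bool signaled; };

struct ev_bo {
    struct ev_fence *fences[EV_MAX_FENCES]; /* fences attached to the BO */
    size_t nfences;
    bool resident;
};

enum ev_result { EV_EVICTED, EV_BUSY };

/* A BO is idle when every fence attached to it has signaled. */
static bool ev_bo_idle(const struct ev_bo *bo)
{
    for (size_t i = 0; i < bo->nfences; i++)
        if (!bo->fences[i]->signaled)
            return false;
    return true;
}

/* Stand-in for a blocking wait: the model simply marks the fence done. */
static void ev_fence_wait(struct ev_fence *f)
{
    f->signaled = true;
}

/*
 * Evict a BO. If it still has unsignaled fences, either give up
 * (may_wait == false) or wait for all of them before evicting.
 */
static enum ev_result ev_bo_evict(struct ev_bo *bo, bool may_wait)
{
    if (!ev_bo_idle(bo)) {
        if (!may_wait)
            return EV_BUSY;
        for (size_t i = 0; i < bo->nfences; i++)
            ev_fence_wait(bo->fences[i]);
    }
    bo->resident = false;
    return EV_EVICTED;
}
```

In this model no explicit pin flag is needed: a BO with an unsignaled fence is never evicted underneath the task that uses it.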

Regards,
Christian.


Regards,
Qiang

Using the MMU is certainly better in general, but it is usually only optional
and a pain in the ass to get working. We have had that in amdgpu for quite a
while now as well and still don't use it because of that.

Regards,
Christian.


Regards,
Qiang

Christian.


- drm_sched is used for GPU task scheduling. Each OpenGL context should
     have a lima context object in the kernel to distinguish tasks
     from different users. drm_sched takes tasks from each lima
     context in a fair way.
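
To illustrate what taking tasks from each context "in a fair way" can look like, here is a minimal round-robin sketch. It is an assumption-laden model, not drm_sched itself: the `rr_*` names are hypothetical, task ids are assumed to be non-negative, and the actual selection policy of drm_sched is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

#define RR_MAX_TASKS 8

/* Hypothetical per-context FIFO of pending task ids. */
struct rr_ctx {
    int tasks[RR_MAX_TASKS];
    size_t head, count;
};

struct rr_sched {
    struct rr_ctx *ctxs;
    size_t nctx;
    size_t next; /* context to look at first on the next pick */
};

/* Queue a task on a context; returns -1 if the FIFO is full. */
static int rr_ctx_push(struct rr_ctx *c, int task)
{
    if (c->count == RR_MAX_TASKS)
        return -1;
    c->tasks[(c->head + c->count) % RR_MAX_TASKS] = task;
    c->count++;
    return 0;
}

/*
 * Pick the next task, starting the search at the context after the
 * one served last, so a busy context cannot starve the others.
 * Returns the task id, or -1 if every context is empty.
 */
static int rr_pick(struct rr_sched *s)
{
    for (size_t i = 0; i < s->nctx; i++) {
        size_t idx = (s->next + i) % s->nctx;
        struct rr_ctx *c = &s->ctxs[idx];
        int task;

        if (c->count == 0)
            continue;
        task = c->tasks[c->head];
        c->head = (c->head + 1) % RR_MAX_TASKS;
        c->count--;
        s->next = (idx + 1) % s->nctx;
        return task;
    }
    return -1;
}
```

With one context holding tasks 1, 2, 3 and another holding task 10, the pick order in this model is 1, 10, 2, 3.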

Not implemented:
- Dump buffer support
- Power management
- Performance counter

This patch series just packs a pair of .c/.h files into each patch.
For the whole history of this driver's development, see:
https://github.com/yuq/linux-lima/commits/lima-4.17-rc4

The Mesa driver is still in development and not ready for daily usage,
but it can run some simple tests like kmscube and glmark2, see:
https://github.com/yuq/mesa-lima

Andrei Paulau (1):
     arm64/dts: add switch-delay for meson mali

Lima Project Developers (10):
     drm/lima: add mali 4xx GPU hardware regs
     drm/lima: add lima core driver
     drm/lima: add GPU device functions
     drm/lima: add PMU related functions
     drm/lima: add PP related functions
     drm/lima: add MMU related functions
     drm/lima: add GPU virtual memory space handling
     drm/lima: add GEM related functions
     drm/lima: add GEM Prime related functions
     drm/lima: add makefile and kconfig

Qiang Yu (12):
     dt-bindings: add switch-delay property for mali-utgard
     arm64/dts: add switch-delay for meson mali
     Revert "drm: Nerf the preclose callback for modern drivers"
     drm/lima: add lima uapi header
     drm/lima: add L2 cache functions
     drm/lima: add GP related functions
     drm/lima: add BCAST related function
     drm/lima: add DLBU related functions
     drm/lima: add TTM subsystem functions
     drm/lima: add buffer object functions
     drm/lima: add GPU schedule using DRM_SCHED
     drm/lima: add context related functions

Simon Shields (1):
     ARM: dts: add gpu node to exynos4

    .../bindings/gpu/arm,mali-utgard.txt          |   4 +
    arch/arm/boot/dts/exynos4.dtsi                |  33 ++
    arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi   |   1 +
    .../boot/dts/amlogic/meson-gxl-mali.dtsi      |   1 +
    drivers/gpu/drm/Kconfig                       |   2 +
    drivers/gpu/drm/Makefile                      |   1 +
    drivers/gpu/drm/drm_file.c                    |   8 +-
    drivers/gpu/drm/lima/Kconfig                  |   9 +
    drivers/gpu/drm/lima/Makefile                 |  19 +
    drivers/gpu/drm/lima/lima_bcast.c             |  65 +++
    drivers/gpu/drm/lima/lima_bcast.h             |  34 ++
    drivers/gpu/drm/lima/lima_ctx.c               | 143 +++++
    drivers/gpu/drm/lima/lima_ctx.h               |  51 ++
    drivers/gpu/drm/lima/lima_device.c            | 407 ++++++++++++++
    drivers/gpu/drm/lima/lima_device.h            | 136 +++++
    drivers/gpu/drm/lima/lima_dlbu.c              |  75 +++
    drivers/gpu/drm/lima/lima_dlbu.h              |  37 ++
    drivers/gpu/drm/lima/lima_drv.c               | 466 ++++++++++++++++
    drivers/gpu/drm/lima/lima_drv.h               |  77 +++
    drivers/gpu/drm/lima/lima_gem.c               | 459 ++++++++++++++++
    drivers/gpu/drm/lima/lima_gem.h               |  41 ++
    drivers/gpu/drm/lima/lima_gem_prime.c         |  66 +++
    drivers/gpu/drm/lima/lima_gem_prime.h         |  31 ++
    drivers/gpu/drm/lima/lima_gp.c                | 293 +++++++++++
    drivers/gpu/drm/lima/lima_gp.h                |  34 ++
    drivers/gpu/drm/lima/lima_l2_cache.c          |  98 ++++
    drivers/gpu/drm/lima/lima_l2_cache.h          |  32 ++
    drivers/gpu/drm/lima/lima_mmu.c               | 154 ++++++
    drivers/gpu/drm/lima/lima_mmu.h               |  34 ++
    drivers/gpu/drm/lima/lima_object.c            | 120 +++++
    drivers/gpu/drm/lima/lima_object.h            |  87 +++
    drivers/gpu/drm/lima/lima_pmu.c               |  85 +++
    drivers/gpu/drm/lima/lima_pmu.h               |  30 ++
    drivers/gpu/drm/lima/lima_pp.c                | 418 +++++++++++++++
    drivers/gpu/drm/lima/lima_pp.h                |  37 ++
    drivers/gpu/drm/lima/lima_regs.h              | 304 +++++++++++
    drivers/gpu/drm/lima/lima_sched.c             | 497 ++++++++++++++++++
    drivers/gpu/drm/lima/lima_sched.h             | 126 +++++
    drivers/gpu/drm/lima/lima_ttm.c               | 409 ++++++++++++++
    drivers/gpu/drm/lima/lima_ttm.h               |  44 ++
    drivers/gpu/drm/lima/lima_vm.c                | 312 +++++++++++
    drivers/gpu/drm/lima/lima_vm.h                |  73 +++
    include/drm/drm_drv.h                         |  23 +-
    include/uapi/drm/lima_drm.h                   | 195 +++++++
    44 files changed, 5565 insertions(+), 6 deletions(-)
    create mode 100644 drivers/gpu/drm/lima/Kconfig
    create mode 100644 drivers/gpu/drm/lima/Makefile
    create mode 100644 drivers/gpu/drm/lima/lima_bcast.c
    create mode 100644 drivers/gpu/drm/lima/lima_bcast.h
    create mode 100644 drivers/gpu/drm/lima/lima_ctx.c
    create mode 100644 drivers/gpu/drm/lima/lima_ctx.h
    create mode 100644 drivers/gpu/drm/lima/lima_device.c
    create mode 100644 drivers/gpu/drm/lima/lima_device.h
    create mode 100644 drivers/gpu/drm/lima/lima_dlbu.c
    create mode 100644 drivers/gpu/drm/lima/lima_dlbu.h
    create mode 100644 drivers/gpu/drm/lima/lima_drv.c
    create mode 100644 drivers/gpu/drm/lima/lima_drv.h
    create mode 100644 drivers/gpu/drm/lima/lima_gem.c
    create mode 100644 drivers/gpu/drm/lima/lima_gem.h
    create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.c
    create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.h
    create mode 100644 drivers/gpu/drm/lima/lima_gp.c
    create mode 100644 drivers/gpu/drm/lima/lima_gp.h
    create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.c
    create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.h
    create mode 100644 drivers/gpu/drm/lima/lima_mmu.c
    create mode 100644 drivers/gpu/drm/lima/lima_mmu.h
    create mode 100644 drivers/gpu/drm/lima/lima_object.c
    create mode 100644 drivers/gpu/drm/lima/lima_object.h
    create mode 100644 drivers/gpu/drm/lima/lima_pmu.c
    create mode 100644 drivers/gpu/drm/lima/lima_pmu.h
    create mode 100644 drivers/gpu/drm/lima/lima_pp.c
    create mode 100644 drivers/gpu/drm/lima/lima_pp.h
    create mode 100644 drivers/gpu/drm/lima/lima_regs.h
    create mode 100644 drivers/gpu/drm/lima/lima_sched.c
    create mode 100644 drivers/gpu/drm/lima/lima_sched.h
    create mode 100644 drivers/gpu/drm/lima/lima_ttm.c
    create mode 100644 drivers/gpu/drm/lima/lima_ttm.h
    create mode 100644 drivers/gpu/drm/lima/lima_vm.c
    create mode 100644 drivers/gpu/drm/lima/lima_vm.h
    create mode 100644 include/uapi/drm/lima_drm.h

