On Wed, May 23, 2018 at 11:44 PM, Daniel Vetter <daniel@xxxxxxxx> wrote:
> On Wed, May 23, 2018 at 3:52 PM, Qiang Yu <yuq825@xxxxxxxxx> wrote:
>> On Wed, May 23, 2018 at 5:29 PM, Christian König
>> <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>>> On 18.05.2018 at 11:27, Qiang Yu wrote:
>>>>
>>>> Kernel DRM driver for ARM Mali 400/450 GPUs.
>>>>
>>>> This implementation mainly takes the amdgpu DRM driver as reference.
>>>>
>>>> - Mali 4xx GPUs have two kinds of processors: GP and PP. GP is for
>>>>   OpenGL vertex shader processing and PP is for fragment shader
>>>>   processing. Each processor has its own MMU, so processors work in
>>>>   virtual address space.
>>>> - There's only one GP but multiple PPs (max 4 for Mali 400 and 8
>>>>   for Mali 450) in the same Mali 4xx GPU. All PPs are grouped
>>>>   together to handle a single fragment shader task, which is split
>>>>   up by FB output tiles. The Mali 400 user space driver is
>>>>   responsible for assigning target tiles to each PP, but Mali 450
>>>>   has a HW module called DLBU to dynamically balance each PP's
>>>>   load.
>>>> - The user space driver allocates buffer objects and maps them into
>>>>   GPU virtual address space, uploads command streams and draw data
>>>>   through a CPU mmap of the buffer object, then submits tasks to
>>>>   GP/PP with a register frame indicating where the command stream
>>>>   is, plus misc settings.
>>>> - There's no command stream validation/relocation because each user
>>>>   process has its own GPU virtual address space. The GP/PP MMUs
>>>>   switch virtual address spaces before running two tasks from
>>>>   different user processes. Erroneous or malicious user space code
>>>>   just gets an MMU fault or a GP/PP error IRQ, after which the
>>>>   HW/SW is recovered.
>>>> - Use TTM as MM. TTM_PL_TT type memory is used as the content of a
>>>>   lima buffer object, allocated from the TTM page pool. All lima
>>>>   buffer objects get pinned with TTM_PL_FLAG_NO_EVICT at allocation
>>>>   time, so there's no buffer eviction and swap for now. We need
>>>>   reverse engineering to see if and how GP/PP support MMU fault
>>>>   recovery (continuing execution). Otherwise we have to pin/unpin
>>>>   each involved buffer at task creation/deletion.
>>>
>>> Well, pinning all memory is usually a no-go for upstreaming. But
>>> since you are already using drm_sched for GPU task scheduling, why
>>> do you actually need this?
>>>
>>> The scheduler should take care of signaling all fences when the
>>> hardware is done with its magic, and that is enough for TTM to note
>>> that a buffer object is movable again (e.g. unpin it).
>>
>> Please correct me if I'm wrong.
>>
>> One way to implement eviction/swap is like this:
>> call validation on each buffer involved in a task. But this won't
>> prevent the buffers from being evicted/swapped while the task is
>> executing, so a GPU MMU fault may happen, and in the fault handler
>> we need to restore the evicted/swapped buffer.
>>
>> Another way is to pin/unpin the involved buffers at task
>> creation/free.
>>
>> The first way is better when memory load is low and the second way
>> is better when memory load is high. The first way also needs less
>> memory.
>>
>> So I'd prefer the first way, but because the GPU MMU fault recovery
>> HW op still needs reverse engineering, I have to pin all buffers for
>> now. Once the HW op is clear, I can choose one way to implement.
>
> All the drivers using ttm have something that looks like vram, or a
> requirement to move buffers around. Afaiui that includes the virtio
> drm driver.

Does the virtio drm driver need to move buffers around? amdgpu also
has no vram in the APU case.
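To make the "pin everything" point above concrete, it boils down to
giving every BO a placement like the following at creation time. This
is just a sketch against the current TTM structs; the lima_* names are
illustrative, not necessarily what the patches use:

#include <drm/ttm/ttm_placement.h>

/*
 * A placement that keeps a BO resident in TTM_PL_TT for its whole
 * lifetime: TTM_PL_FLAG_NO_EVICT makes TTM's eviction and swap paths
 * skip this BO, so its GPU VA mapping can never go stale while a
 * GP/PP task is running.
 */
static const struct ttm_place lima_pinned_place = {
	.fpfn = 0,
	.lpfn = 0,
	.flags = TTM_PL_FLAG_TT | TTM_PL_FLAG_CACHED |
		 TTM_PL_FLAG_NO_EVICT,
};

static const struct ttm_placement lima_pinned_placement = {
	.num_placement = 1,
	.placement = &lima_pinned_place,
	.num_busy_placement = 1,
	.busy_placement = &lima_pinned_place,
};

Dropping TTM_PL_FLAG_NO_EVICT and validating the involved BOs per task
would be the first option I described, but that only becomes safe once
the MMU fault recovery behaviour is understood.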
> From your description you don't have such a requirement, and then
> doing what etnaviv has done would be a lot simpler. Everything that's
> not related to buffer movement handling is also available outside of
> ttm already.

Yeah, I could do it like etnaviv, but that's not simpler than using
TTM directly, especially when I want some optimizations (like the TTM
page pool, ttm_eu_reserve_buffers, ttm_bo_mmap). If I have to/want to
implement them anyway, why not just use TTM directly with all those
helper functions?
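For example, the per-task buffer handling I have in mind is mostly
just the ttm_execbuf_util helpers. A rough sketch, where
lima_sched_task and its fields are made-up names here and error
handling is trimmed:

#include <linux/list.h>
#include <drm/ttm/ttm_execbuf_util.h>

/* illustrative task struct: one ttm_validate_buffer per BO */
struct lima_sched_task {
	unsigned int nr_bos;
	struct ttm_buffer_object **bos;
	struct ttm_validate_buffer *vbufs;
};

static int lima_task_reserve(struct lima_sched_task *task,
			     struct ww_acquire_ctx *ticket,
			     struct list_head *list)
{
	unsigned int i;

	for (i = 0; i < task->nr_bos; i++) {
		task->vbufs[i].bo = task->bos[i];
		task->vbufs[i].shared = true;
		list_add_tail(&task->vbufs[i].head, list);
	}

	/* ww_mutex-locks every BO on the list, deadlock free */
	return ttm_eu_reserve_buffers(ticket, list, true, NULL);
}

static void lima_task_submit_done(struct ww_acquire_ctx *ticket,
				  struct list_head *list,
				  struct dma_fence *fence)
{
	/* attach the hw fence to every BO and unreserve in one go */
	ttm_eu_fence_buffer_objects(ticket, list, fence);
}

With drm_sched handing back the hardware fence here, TTM knows when a
BO is idle again, which is exactly the hook the later unpin/eviction
logic would use.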
Regards,
Qiang

> -Daniel
>
>> Regards,
>> Qiang
>>
>>>
>>> Christian.
>>>
>>>
>>>> - Use drm_sched for GPU task scheduling. Each OpenGL context should
>>>>   have a lima context object in the kernel to distinguish tasks
>>>>   from different users. drm_sched gets tasks from each lima context
>>>>   in a fair way.
>>>>
>>>> Not implemented:
>>>> - Dump buffer support
>>>> - Power management
>>>> - Performance counter
>>>>
>>>> This patch series just packs a pair of .c/.h files into each patch.
>>>> For the whole history of this driver's development, see:
>>>> https://github.com/yuq/linux-lima/commits/lima-4.17-rc4
>>>>
>>>> The Mesa driver is still in development and not ready for daily
>>>> usage, but it can run some simple tests like kmscube and glmark2,
>>>> see:
>>>> https://github.com/yuq/mesa-lima
>>>>
>>>> Andrei Paulau (1):
>>>>   arm64/dts: add switch-delay for meson mali
>>>>
>>>> Lima Project Developers (10):
>>>>   drm/lima: add mali 4xx GPU hardware regs
>>>>   drm/lima: add lima core driver
>>>>   drm/lima: add GPU device functions
>>>>   drm/lima: add PMU related functions
>>>>   drm/lima: add PP related functions
>>>>   drm/lima: add MMU related functions
>>>>   drm/lima: add GPU virtual memory space handing
>>>>   drm/lima: add GEM related functions
>>>>   drm/lima: add GEM Prime related functions
>>>>   drm/lima: add makefile and kconfig
>>>>
>>>> Qiang Yu (12):
>>>>   dt-bindings: add switch-delay property for mali-utgard
>>>>   arm64/dts: add switch-delay for meson mali
>>>>   Revert "drm: Nerf the preclose callback for modern drivers"
>>>>   drm/lima: add lima uapi header
>>>>   drm/lima: add L2 cache functions
>>>>   drm/lima: add GP related functions
>>>>   drm/lima: add BCAST related function
>>>>   drm/lima: add DLBU related functions
>>>>   drm/lima: add TTM subsystem functions
>>>>   drm/lima: add buffer object functions
>>>>   drm/lima: add GPU schedule using DRM_SCHED
>>>>   drm/lima: add context related functions
>>>>
>>>> Simon Shields (1):
>>>>   ARM: dts: add gpu node to exynos4
>>>>
>>>>  .../bindings/gpu/arm,mali-utgard.txt        |   4 +
>>>>  arch/arm/boot/dts/exynos4.dtsi              |  33 ++
>>>>  arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi |   1 +
>>>>  .../boot/dts/amlogic/meson-gxl-mali.dtsi    |   1 +
>>>>  drivers/gpu/drm/Kconfig                     |   2 +
>>>>  drivers/gpu/drm/Makefile                    |   1 +
>>>>  drivers/gpu/drm/drm_file.c                  |   8 +-
>>>>  drivers/gpu/drm/lima/Kconfig                |   9 +
>>>>  drivers/gpu/drm/lima/Makefile               |  19 +
>>>>  drivers/gpu/drm/lima/lima_bcast.c           |  65 +++
>>>>  drivers/gpu/drm/lima/lima_bcast.h           |  34 ++
>>>>  drivers/gpu/drm/lima/lima_ctx.c             | 143 +++++
>>>>  drivers/gpu/drm/lima/lima_ctx.h             |  51 ++
>>>>  drivers/gpu/drm/lima/lima_device.c          | 407 ++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_device.h          | 136 +++++
>>>>  drivers/gpu/drm/lima/lima_dlbu.c            |  75 +++
>>>>  drivers/gpu/drm/lima/lima_dlbu.h            |  37 ++
>>>>  drivers/gpu/drm/lima/lima_drv.c             | 466 ++++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_drv.h             |  77 +++
>>>>  drivers/gpu/drm/lima/lima_gem.c             | 459 ++++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_gem.h             |  41 ++
>>>>  drivers/gpu/drm/lima/lima_gem_prime.c       |  66 +++
>>>>  drivers/gpu/drm/lima/lima_gem_prime.h       |  31 ++
>>>>  drivers/gpu/drm/lima/lima_gp.c              | 293 +++++++++++
>>>>  drivers/gpu/drm/lima/lima_gp.h              |  34 ++
>>>>  drivers/gpu/drm/lima/lima_l2_cache.c        |  98 ++++
>>>>  drivers/gpu/drm/lima/lima_l2_cache.h        |  32 ++
>>>>  drivers/gpu/drm/lima/lima_mmu.c             | 154 ++++++
>>>>  drivers/gpu/drm/lima/lima_mmu.h             |  34 ++
>>>>  drivers/gpu/drm/lima/lima_object.c          | 120 +++++
>>>>  drivers/gpu/drm/lima/lima_object.h          |  87 +++
>>>>  drivers/gpu/drm/lima/lima_pmu.c             |  85 +++
>>>>  drivers/gpu/drm/lima/lima_pmu.h             |  30 ++
>>>>  drivers/gpu/drm/lima/lima_pp.c              | 418 +++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_pp.h              |  37 ++
>>>>  drivers/gpu/drm/lima/lima_regs.h            | 304 +++++++++++
>>>>  drivers/gpu/drm/lima/lima_sched.c           | 497 ++++++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_sched.h           | 126 +++++
>>>>  drivers/gpu/drm/lima/lima_ttm.c             | 409 ++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_ttm.h             |  44 ++
>>>>  drivers/gpu/drm/lima/lima_vm.c              | 312 +++++++++++
>>>>  drivers/gpu/drm/lima/lima_vm.h              |  73 +++
>>>>  include/drm/drm_drv.h                       |  23 +-
>>>>  include/uapi/drm/lima_drm.h                 | 195 +++++++
>>>>  44 files changed, 5565 insertions(+), 6 deletions(-)
>>>>  create mode 100644 drivers/gpu/drm/lima/Kconfig
>>>>  create mode 100644 drivers/gpu/drm/lima/Makefile
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_bcast.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_bcast.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_ctx.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_ctx.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_device.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_device.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_dlbu.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_dlbu.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_drv.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_drv.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gem.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gem.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gp.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gp.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_mmu.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_mmu.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_object.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_object.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_pmu.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_pmu.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_pp.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_pp.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_regs.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_sched.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_sched.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_ttm.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_ttm.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_vm.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_vm.h
>>>>  create mode 100644 include/uapi/drm/lima_drm.h
>>>>
>>>
>> _______________________________________________
>> dri-devel mailing list
>> dri-devel@xxxxxxxxxxxxxxxxxxxxx
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel