Re: [PATCH RFC 00/24] Lima DRM driver

On Wed, May 23, 2018 at 9:59 PM, Christian König
<christian.koenig@xxxxxxx> wrote:
> Am 23.05.2018 um 15:52 schrieb Qiang Yu:
>>
>> On Wed, May 23, 2018 at 5:29 PM, Christian König
>> <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>>>
>>> Am 18.05.2018 um 11:27 schrieb Qiang Yu:
>>>>
>>>> Kernel DRM driver for ARM Mali 400/450 GPUs.
>>>>
>>>> This implementation mainly takes the amdgpu DRM driver as reference.
>>>>
>>>> - Mali 4xx GPUs have two kinds of processors, GP and PP. GP is for
>>>>     OpenGL vertex shader processing and PP is for fragment shader
>>>>     processing. Each processor has its own MMU, so the processors work
>>>>     in virtual address space.
>>>> - There's only one GP but multiple PPs (max 4 for mali 400 and 8
>>>>     for mali 450) in the same mali 4xx GPU. All PPs are grouped
>>>>     together to handle a single fragment shader task divided by
>>>>     FB output tiled pixels. The mali 400 user space driver is
>>>>     responsible for assigning target tiled pixels to each PP, but mali
>>>>     450 has a HW module called DLBU to dynamically balance each
>>>>     PP's load.
>>>> - The user space driver allocates buffer objects and maps them into
>>>>     GPU virtual address space, uploads the command stream and draw
>>>>     data through a CPU mmap of the buffer object, then submits a task
>>>>     to GP/PP with a register frame indicating where the command
>>>>     stream is, plus misc settings.
>>>> - There's no command stream validation/relocation because each user
>>>>     process has its own GPU virtual address space. The GP/PP MMU
>>>>     switches virtual address space between tasks from different
>>>>     user processes. Erroneous or evil user space code just gets an MMU
>>>>     fault or GP/PP error IRQ, then the HW/SW is recovered.
>>>> - Use TTM as MM. TTM_PL_TT type memory is used as the content of
>>>>     a lima buffer object, which is allocated from the TTM page pool.
>>>>     All lima buffer objects get pinned with TTM_PL_FLAG_NO_EVICT at
>>>>     allocation, so there's no buffer eviction and swap for now. We
>>>>     need reverse engineering to see if and how GP/PP support MMU
>>>>     fault recovery (continue execution). Otherwise we have to
>>>>     pin/unpin each involved buffer at task creation/deletion.
>>>
>>>
>>> Well pinning all memory is usually a no-go for upstreaming. But since you
>>> are already using the drm_sched for GPU task scheduling, why do you
>>> actually need this?
>>>
>>> The scheduler should take care of signaling all fences when the hardware
>>> is done with its magic, and that is enough for TTM to note that a buffer
>>> object is movable again (e.g. unpin them).
>>
>> Please correct me if I'm wrong.
>
>
> Well, you are wrong :)
>
>> One way to implement eviction/swap is like this:
>> call validation on each buffer involved in a task. But this won't
>> prevent a buffer from being evicted/swapped while the task executes, so
>> a GPU MMU fault may happen, and in the handler we need to recover the
>> evicted/swapped buffer.
>>
>> Another way is to pin/unpin the buffers involved at task create/free.
>>
>> The first way is better when memory load is low and the second way is
>> better when memory load is high. The first way also needs less memory.
>>
>> So I'd prefer the first way, but because the GPU MMU fault HW operation
>> needs reverse engineering, I have to pin all buffers for now. After
>> the HW operation is clear, I can choose one way to implement.
>
>
> The general approach is:
> 1.) Lock all BOs
> 2.) Validate all BOs
> 3.) Add the fence
> 4.) Unlock the BOs
This is the task prepare process, right?

>
> BOs can't be evicted while they are locked
During the task prepare stage they're locked, but after the task is queued
they get unlocked and become evictable?

> and since you already add the
> fence that should be perfectly sufficient to prevent it from being evicted
> until your operation is completed.
You mean I have to explicitly pin it with TTM_PL_FLAG_NO_EVICT
at task creation, or that TTM will check the buffer's reservation object
and won't evict it if it sees a fence?
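
To make the fence-based reading of the question concrete, here is a minimal
userspace sketch of the four steps plus an eviction check. Everything in it
(the toy_* types and functions) is hypothetical and made up for illustration;
none of it is the real TTM or drm_sched API. Whether TTM really refuses the
eviction like this, or instead waits for the fence before moving the buffer,
is exactly what I'm unsure about.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only: these are NOT real TTM or drm_sched types. */
struct toy_fence {
	bool signaled;            /* set when the "hardware" finishes the task */
};

struct toy_bo {
	bool locked;              /* reservation lock is held */
	bool resident;            /* backing pages are where the GPU expects them */
	struct toy_fence *fence;  /* last fence attached to this BO, or NULL */
};

/*
 * Assumed eviction rule: refuse while the BO is locked or while it still
 * carries an unsignaled fence. Returns true if the BO was evicted.
 */
static bool toy_bo_try_evict(struct toy_bo *bo)
{
	if (bo->locked)
		return false;
	if (bo->fence && !bo->fence->signaled)
		return false;
	bo->resident = false;
	return true;
}

/* Task prepare: 1) lock all BOs, 2) validate, 3) add the fence, 4) unlock. */
static void toy_task_prepare(struct toy_bo **bos, size_t n,
			     struct toy_fence *fence)
{
	size_t i;

	assert(bos && fence);

	for (i = 0; i < n; i++) {          /* 1) lock all BOs */
		assert(!bos[i]->locked);
		bos[i]->locked = true;
	}
	for (i = 0; i < n; i++)            /* 2) validate: make them resident */
		bos[i]->resident = true;
	for (i = 0; i < n; i++)            /* 3) attach the task's fence */
		bos[i]->fence = fence;
	for (i = 0; i < n; i++)            /* 4) unlock */
		bos[i]->locked = false;
}

/* Called when the task completes; after this the BOs are movable again. */
static void toy_task_done(struct toy_fence *fence)
{
	fence->signaled = true;
}
```

In this model no explicit pin is needed: between toy_task_prepare() and
toy_task_done() the unsignaled fence alone keeps toy_bo_try_evict() from
touching the buffers, even though they are already unlocked.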

Regards,
Qiang

>
> Using the MMU is certainly better in general, but usually only optional
> and a pain in the ass to get working. We have had that in amdgpu for quite
> a while as well now and still don't use it because of that.
>
> Regards,
> Christian.
>
>
>>
>> Regards,
>> Qiang
>>
>>> Christian.
>>>
>>>
>>>> - Use drm_sched for GPU task scheduling. Each OpenGL context should
>>>>     have a lima context object in the kernel to distinguish tasks
>>>>     from different users. drm_sched gets tasks from each lima context
>>>>     in a fair way.
>>>>
>>>> Not implemented:
>>>> - Dump buffer support
>>>> - Power management
>>>> - Performance counter
>>>>
>>>> This patch series just packs a pair of .c/.h files in each patch.
>>>> For the whole history of this driver's development, see:
>>>> https://github.com/yuq/linux-lima/commits/lima-4.17-rc4
>>>>
>>>> The Mesa driver is still in development and not ready for daily usage,
>>>> but can run some simple tests like kmscube and glmark2, see:
>>>> https://github.com/yuq/mesa-lima
>>>>
>>>> Andrei Paulau (1):
>>>>     arm64/dts: add switch-delay for meson mali
>>>>
>>>> Lima Project Developers (10):
>>>>     drm/lima: add mali 4xx GPU hardware regs
>>>>     drm/lima: add lima core driver
>>>>     drm/lima: add GPU device functions
>>>>     drm/lima: add PMU related functions
>>>>     drm/lima: add PP related functions
>>>>     drm/lima: add MMU related functions
>>>>     drm/lima: add GPU virtual memory space handing
>>>>     drm/lima: add GEM related functions
>>>>     drm/lima: add GEM Prime related functions
>>>>     drm/lima: add makefile and kconfig
>>>>
>>>> Qiang Yu (12):
>>>>     dt-bindings: add switch-delay property for mali-utgard
>>>>     arm64/dts: add switch-delay for meson mali
>>>>     Revert "drm: Nerf the preclose callback for modern drivers"
>>>>     drm/lima: add lima uapi header
>>>>     drm/lima: add L2 cache functions
>>>>     drm/lima: add GP related functions
>>>>     drm/lima: add BCAST related function
>>>>     drm/lima: add DLBU related functions
>>>>     drm/lima: add TTM subsystem functions
>>>>     drm/lima: add buffer object functions
>>>>     drm/lima: add GPU schedule using DRM_SCHED
>>>>     drm/lima: add context related functions
>>>>
>>>> Simon Shields (1):
>>>>     ARM: dts: add gpu node to exynos4
>>>>
>>>>    .../bindings/gpu/arm,mali-utgard.txt          |   4 +
>>>>    arch/arm/boot/dts/exynos4.dtsi                |  33 ++
>>>>    arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi   |   1 +
>>>>    .../boot/dts/amlogic/meson-gxl-mali.dtsi      |   1 +
>>>>    drivers/gpu/drm/Kconfig                       |   2 +
>>>>    drivers/gpu/drm/Makefile                      |   1 +
>>>>    drivers/gpu/drm/drm_file.c                    |   8 +-
>>>>    drivers/gpu/drm/lima/Kconfig                  |   9 +
>>>>    drivers/gpu/drm/lima/Makefile                 |  19 +
>>>>    drivers/gpu/drm/lima/lima_bcast.c             |  65 +++
>>>>    drivers/gpu/drm/lima/lima_bcast.h             |  34 ++
>>>>    drivers/gpu/drm/lima/lima_ctx.c               | 143 +++++
>>>>    drivers/gpu/drm/lima/lima_ctx.h               |  51 ++
>>>>    drivers/gpu/drm/lima/lima_device.c            | 407 ++++++++++++++
>>>>    drivers/gpu/drm/lima/lima_device.h            | 136 +++++
>>>>    drivers/gpu/drm/lima/lima_dlbu.c              |  75 +++
>>>>    drivers/gpu/drm/lima/lima_dlbu.h              |  37 ++
>>>>    drivers/gpu/drm/lima/lima_drv.c               | 466 ++++++++++++++++
>>>>    drivers/gpu/drm/lima/lima_drv.h               |  77 +++
>>>>    drivers/gpu/drm/lima/lima_gem.c               | 459 ++++++++++++++++
>>>>    drivers/gpu/drm/lima/lima_gem.h               |  41 ++
>>>>    drivers/gpu/drm/lima/lima_gem_prime.c         |  66 +++
>>>>    drivers/gpu/drm/lima/lima_gem_prime.h         |  31 ++
>>>>    drivers/gpu/drm/lima/lima_gp.c                | 293 +++++++++++
>>>>    drivers/gpu/drm/lima/lima_gp.h                |  34 ++
>>>>    drivers/gpu/drm/lima/lima_l2_cache.c          |  98 ++++
>>>>    drivers/gpu/drm/lima/lima_l2_cache.h          |  32 ++
>>>>    drivers/gpu/drm/lima/lima_mmu.c               | 154 ++++++
>>>>    drivers/gpu/drm/lima/lima_mmu.h               |  34 ++
>>>>    drivers/gpu/drm/lima/lima_object.c            | 120 +++++
>>>>    drivers/gpu/drm/lima/lima_object.h            |  87 +++
>>>>    drivers/gpu/drm/lima/lima_pmu.c               |  85 +++
>>>>    drivers/gpu/drm/lima/lima_pmu.h               |  30 ++
>>>>    drivers/gpu/drm/lima/lima_pp.c                | 418 +++++++++++++++
>>>>    drivers/gpu/drm/lima/lima_pp.h                |  37 ++
>>>>    drivers/gpu/drm/lima/lima_regs.h              | 304 +++++++++++
>>>>    drivers/gpu/drm/lima/lima_sched.c             | 497 ++++++++++++++++++
>>>>    drivers/gpu/drm/lima/lima_sched.h             | 126 +++++
>>>>    drivers/gpu/drm/lima/lima_ttm.c               | 409 ++++++++++++++
>>>>    drivers/gpu/drm/lima/lima_ttm.h               |  44 ++
>>>>    drivers/gpu/drm/lima/lima_vm.c                | 312 +++++++++++
>>>>    drivers/gpu/drm/lima/lima_vm.h                |  73 +++
>>>>    include/drm/drm_drv.h                         |  23 +-
>>>>    include/uapi/drm/lima_drm.h                   | 195 +++++++
>>>>    44 files changed, 5565 insertions(+), 6 deletions(-)
>>>>    create mode 100644 drivers/gpu/drm/lima/Kconfig
>>>>    create mode 100644 drivers/gpu/drm/lima/Makefile
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_bcast.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_bcast.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_ctx.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_ctx.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_device.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_device.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_dlbu.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_dlbu.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_drv.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_drv.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_gem.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_gem.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_gp.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_gp.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_mmu.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_mmu.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_object.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_object.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_pmu.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_pmu.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_pp.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_pp.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_regs.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_sched.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_sched.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_ttm.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_ttm.h
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_vm.c
>>>>    create mode 100644 drivers/gpu/drm/lima/lima_vm.h
>>>>    create mode 100644 include/uapi/drm/lima_drm.h
>>>>
>