On Wed, May 23, 2018 at 9:59 PM, Christian König
<christian.koenig@xxxxxxx> wrote:
> On 23.05.2018 at 15:52, Qiang Yu wrote:
>>
>> On Wed, May 23, 2018 at 5:29 PM, Christian König
>> <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>>>
>>> On 18.05.2018 at 11:27, Qiang Yu wrote:
>>>>
>>>> Kernel DRM driver for ARM Mali 400/450 GPUs.
>>>>
>>>> This implementation mainly takes the amdgpu DRM driver as reference.
>>>>
>>>> - Mali 4xx GPUs have two kinds of processors, GP and PP. GP is for
>>>>   OpenGL vertex shader processing and PP is for fragment shader
>>>>   processing. Each processor has its own MMU, so processors work in
>>>>   virtual address space.
>>>> - There's only one GP but multiple PPs (max 4 for mali 400 and 8
>>>>   for mali 450) in the same mali 4xx GPU. All PPs are grouped
>>>>   together to handle a single fragment shader task divided by
>>>>   FB output tiled pixels. The Mali 400 user space driver is
>>>>   responsible for assigning target tiled pixels to each PP, but mali
>>>>   450 has a HW module called DLBU to dynamically balance each
>>>>   PP's load.
>>>> - The user space driver allocates buffer objects and maps them into
>>>>   GPU virtual address space, uploads the command stream and draw data
>>>>   with a CPU mmap of the buffer object, then submits the task to GP/PP
>>>>   with a register frame indicating where the command stream is and
>>>>   misc settings.
>>>> - There's no command stream validation/relocation because each user
>>>>   process has its own GPU virtual address space. The GP/PP MMU switches
>>>>   virtual address space before running two tasks from different
>>>>   user processes. Erroneous or evil user space code just gets an MMU
>>>>   fault or GP/PP error IRQ, then the HW/SW will be recovered.
>>>> - Use TTM as MM. TTM_PL_TT type memory is used as the content of a
>>>>   lima buffer object, which is allocated from the TTM page pool. All
>>>>   lima buffer objects get pinned with TTM_PL_FLAG_NO_EVICT at
>>>>   allocation, so there's no buffer eviction and swap for now.
>>>>   We need reverse engineering to see if and how GP/PP support MMU
>>>>   fault recovery (continue execution). Otherwise we have to
>>>>   pin/unpin each involved buffer at task creation/deletion.
>>>
>>>
>>> Well pinning all memory is usually a no-go for upstreaming. But since you
>>> are already using the drm_sched for GPU task scheduling why are you
>>> actually needing this?
>>>
>>> The scheduler should take care of signaling all fences when the hardware
>>> is done with its magic and that is enough for TTM to note that a buffer
>>> object is movable again (e.g. unpin them).
>>
>> Please correct me if I'm wrong.
>
>
> Well, you are wrong :)
>
>> One way to implement eviction/swap is like this:
>> call validation on each buffer involved in a task, but this won't
>> prevent it from eviction/swap when executing, so a GPU MMU fault may
>> happen and in the handler we need to recover the buffer evicted/swapped.
>>
>> Another way is to pin/unpin the buffers involved at task create/free.
>>
>> The first way is better when memory load is low and the second way is
>> better when memory load is high. The first way also needs less memory.
>>
>> So I'd prefer the first way, but because the GPU MMU fault
>> HW op needs reverse engineering, I have to pin all buffers now. After
>> the HW op is clear, I can choose one way to implement.
>
>
> The general approach is:
> 1.) Lock all BOs
> 2.) Validate all BOs
> 3.) Add the fence
> 4.) Unlock the BOs

This is the task prepare process, right?

>
> BOs can't be evicted while they are locked

During the task prepare stage they're locked, but after the task is
queued, they get unlocked and become evictable?

> and since you already add the
> fence that should be perfectly sufficient to prevent it from being evicted
> until your operation is completed.

Do you mean I have to explicitly pin it with TTM_PL_FLAG_NO_EVICT at task
creation, or that TTM will check the buffer's reservation object and won't
evict it if it sees a fence?

Regards,
Qiang

>
> Using the MMU is certainly better in general, but usually only optional
> and a pain in the ass to get working. We have that in amdgpu for quite a
> while as well now and still don't use it because of that.
>
> Regards,
> Christian.
>
>
>>
>> Regards,
>> Qiang
>>
>>> Christian.
>>>
>>>
>>>> - Use drm_sched for GPU task scheduling. Each OpenGL context should
>>>>   have a lima context object in the kernel to distinguish tasks
>>>>   from different users. drm_sched gets tasks from each lima context
>>>>   in a fair way.
>>>>
>>>> Not implemented:
>>>> - Dump buffer support
>>>> - Power management
>>>> - Performance counter
>>>>
>>>> This patch series just packs a pair of .c/.h files in each patch.
>>>> For the whole history of this driver's development, see:
>>>> https://github.com/yuq/linux-lima/commits/lima-4.17-rc4
>>>>
>>>> The Mesa driver is still in development and not ready for daily usage,
>>>> but can run some simple tests like kmscube and glmark2, see:
>>>> https://github.com/yuq/mesa-lima
>>>>
>>>> Andrei Paulau (1):
>>>>   arm64/dts: add switch-delay for meson mali
>>>>
>>>> Lima Project Developers (10):
>>>>   drm/lima: add mali 4xx GPU hardware regs
>>>>   drm/lima: add lima core driver
>>>>   drm/lima: add GPU device functions
>>>>   drm/lima: add PMU related functions
>>>>   drm/lima: add PP related functions
>>>>   drm/lima: add MMU related functions
>>>>   drm/lima: add GPU virtual memory space handing
>>>>   drm/lima: add GEM related functions
>>>>   drm/lima: add GEM Prime related functions
>>>>   drm/lima: add makefile and kconfig
>>>>
>>>> Qiang Yu (12):
>>>>   dt-bindings: add switch-delay property for mali-utgard
>>>>   arm64/dts: add switch-delay for meson mali
>>>>   Revert "drm: Nerf the preclose callback for modern drivers"
>>>>   drm/lima: add lima uapi header
>>>>   drm/lima: add L2 cache functions
>>>>   drm/lima: add GP related functions
>>>>   drm/lima: add BCAST related function
>>>>   drm/lima: add DLBU related functions
>>>>   drm/lima: add TTM subsystem functions
>>>>   drm/lima: add buffer object functions
>>>>   drm/lima: add GPU schedule using DRM_SCHED
>>>>   drm/lima: add context related functions
>>>>
>>>> Simon Shields (1):
>>>>   ARM: dts: add gpu node to exynos4
>>>>
>>>>  .../bindings/gpu/arm,mali-utgard.txt | 4 +
>>>>  arch/arm/boot/dts/exynos4.dtsi | 33 ++
>>>>  arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi | 1 +
>>>>  .../boot/dts/amlogic/meson-gxl-mali.dtsi | 1 +
>>>>  drivers/gpu/drm/Kconfig | 2 +
>>>>  drivers/gpu/drm/Makefile | 1 +
>>>>  drivers/gpu/drm/drm_file.c | 8 +-
>>>>  drivers/gpu/drm/lima/Kconfig | 9 +
>>>>  drivers/gpu/drm/lima/Makefile | 19 +
>>>>  drivers/gpu/drm/lima/lima_bcast.c | 65 +++
>>>>  drivers/gpu/drm/lima/lima_bcast.h | 34 ++
>>>>  drivers/gpu/drm/lima/lima_ctx.c | 143 +++++
>>>>  drivers/gpu/drm/lima/lima_ctx.h | 51 ++
>>>>  drivers/gpu/drm/lima/lima_device.c | 407 ++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_device.h | 136 +++++
>>>>  drivers/gpu/drm/lima/lima_dlbu.c | 75 +++
>>>>  drivers/gpu/drm/lima/lima_dlbu.h | 37 ++
>>>>  drivers/gpu/drm/lima/lima_drv.c | 466 ++++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_drv.h | 77 +++
>>>>  drivers/gpu/drm/lima/lima_gem.c | 459 ++++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_gem.h | 41 ++
>>>>  drivers/gpu/drm/lima/lima_gem_prime.c | 66 +++
>>>>  drivers/gpu/drm/lima/lima_gem_prime.h | 31 ++
>>>>  drivers/gpu/drm/lima/lima_gp.c | 293 +++++++++++
>>>>  drivers/gpu/drm/lima/lima_gp.h | 34 ++
>>>>  drivers/gpu/drm/lima/lima_l2_cache.c | 98 ++++
>>>>  drivers/gpu/drm/lima/lima_l2_cache.h | 32 ++
>>>>  drivers/gpu/drm/lima/lima_mmu.c | 154 ++++++
>>>>  drivers/gpu/drm/lima/lima_mmu.h | 34 ++
>>>>  drivers/gpu/drm/lima/lima_object.c | 120 +++++
>>>>  drivers/gpu/drm/lima/lima_object.h | 87 +++
>>>>  drivers/gpu/drm/lima/lima_pmu.c | 85 +++
>>>>  drivers/gpu/drm/lima/lima_pmu.h | 30 ++
>>>>  drivers/gpu/drm/lima/lima_pp.c | 418 +++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_pp.h | 37 ++
>>>>  drivers/gpu/drm/lima/lima_regs.h | 304 +++++++++++
>>>>  drivers/gpu/drm/lima/lima_sched.c | 497 ++++++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_sched.h | 126 +++++
>>>>  drivers/gpu/drm/lima/lima_ttm.c | 409 ++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_ttm.h | 44 ++
>>>>  drivers/gpu/drm/lima/lima_vm.c | 312 +++++++++++
>>>>  drivers/gpu/drm/lima/lima_vm.h | 73 +++
>>>>  include/drm/drm_drv.h | 23 +-
>>>>  include/uapi/drm/lima_drm.h | 195 +++++++
>>>>  44 files changed, 5565 insertions(+), 6 deletions(-)
>>>>  create mode 100644 drivers/gpu/drm/lima/Kconfig
>>>>  create mode 100644 drivers/gpu/drm/lima/Makefile
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_bcast.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_bcast.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_ctx.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_ctx.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_device.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_device.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_dlbu.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_dlbu.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_drv.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_drv.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gem.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gem.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gp.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gp.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_mmu.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_mmu.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_object.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_object.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_pmu.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_pmu.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_pp.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_pp.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_regs.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_sched.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_sched.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_ttm.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_ttm.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_vm.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_vm.h
>>>>  create mode 100644 include/uapi/drm/lima_drm.h
>>>>
>