On Wed, May 23, 2018 at 11:44 PM, Daniel Vetter <daniel@xxxxxxxx> wrote:
> On Wed, May 23, 2018 at 3:52 PM, Qiang Yu <yuq825@xxxxxxxxx> wrote:
>> On Wed, May 23, 2018 at 5:29 PM, Christian König
>> <ckoenig.leichtzumerken@xxxxxxxxx> wrote:
>>> On 18.05.2018 at 11:27, Qiang Yu wrote:
>>>>
>>>> Kernel DRM driver for ARM Mali 400/450 GPUs.
>>>>
>>>> This implementation mainly takes the amdgpu DRM driver as a reference.
>>>>
>>>> - Mali 4xx GPUs have two kinds of processors, GP and PP. GP is for
>>>>   OpenGL vertex shader processing and PP is for fragment shader
>>>>   processing. Each processor has its own MMU, so the processors work
>>>>   in virtual address spaces.
>>>> - There's only one GP but multiple PPs (max 4 for Mali 400 and 8
>>>>   for Mali 450) in the same Mali 4xx GPU. All PPs are grouped
>>>>   together to handle a single fragment shader task, divided by
>>>>   FB output tiled pixels. The Mali 400 user space driver is
>>>>   responsible for assigning target tiled pixels to each PP, but
>>>>   Mali 450 has a HW module called DLBU to dynamically balance each
>>>>   PP's load.
>>>> - The user space driver allocates buffer objects and maps them into
>>>>   the GPU virtual address space, uploads the command stream and draw
>>>>   data through a CPU mmap of the buffer object, then submits tasks
>>>>   to GP/PP with a register frame indicating where the command stream
>>>>   is, plus misc settings.
>>>> - There's no command stream validation/relocation because each user
>>>>   process has its own GPU virtual address space. The GP/PP MMU
>>>>   switches virtual address spaces before running two tasks from
>>>>   different user processes. Erroneous or malicious user space code
>>>>   just gets an MMU fault or GP/PP error IRQ, and then the HW/SW is
>>>>   recovered.
>>>> - Use TTM as the memory manager. TTM_PL_TT type memory is used as
>>>>   the content of a lima buffer object, allocated from the TTM page
>>>>   pool. All lima buffer objects get pinned with TTM_PL_FLAG_NO_EVICT
>>>>   at allocation, so there's no buffer eviction and swap for now. We
>>>>   need reverse engineering to see if and how GP/PP support MMU
>>>>   fault recovery (continuing execution). Otherwise we have to
>>>>   pin/unpin each involved buffer at task creation/deletion.
>>>
>>> Well, pinning all memory is usually a no-go for upstreaming. But
>>> since you are already using drm_sched for GPU task scheduling, why
>>> do you actually need this?
>>>
>>> The scheduler should take care of signaling all fences when the
>>> hardware is done with its magic, and that is enough for TTM to note
>>> that a buffer object is movable again (e.g. to unpin it).
>>
>> Please correct me if I'm wrong.
>>
>> One way to implement eviction/swap is like this:
>> call validation on each buffer involved in a task, but this won't
>> prevent eviction/swap while the task is executing, so a GPU MMU fault
>> may happen, and in the handler we need to recover the evicted/swapped
>> buffer.
>>
>> Another way is to pin/unpin the involved buffers at task
>> creation/free.
>>
>> The first way is better when memory load is low and the second way is
>> better when memory load is high. The first way also needs less memory.
>>
>> So I'd prefer the first way, but because the GPU MMU fault HW
>> operation needs reverse engineering, I have to pin all buffers for
>> now. Once the HW operation is clear, I can choose which way to
>> implement.
>
> All the drivers using ttm have something that looks like vram, or a
> requirement to move buffers around. Afaiui that includes the virtio
> drm driver.

Does the virtio drm driver need to move buffers around? amdgpu also has
no vram on APUs.
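
(As an illustration of the TTM_PL_FLAG_NO_EVICT pinning discussed
earlier in the thread: a minimal sketch, modeled on other TTM drivers
of this era. The lima_* names here are hypothetical; only the TTM
symbols are real.)

    /*
     * Sketch only: pin-at-allocation placement. Validating a BO
     * against this placement keeps it in TT memory and marks it
     * NO_EVICT, so TTM never moves or swaps it out.
     */
    #include <drm/ttm/ttm_placement.h>

    static const struct ttm_place lima_tt_pinned_place = {
            .fpfn = 0,
            .lpfn = 0,
            .flags = TTM_PL_FLAG_TT | TTM_PL_MASK_CACHING |
                     TTM_PL_FLAG_NO_EVICT,
    };

    static const struct ttm_placement lima_tt_pinned_placement = {
            .num_placement = 1,
            .placement = &lima_tt_pinned_place,
            .num_busy_placement = 1,
            .busy_placement = &lima_tt_pinned_place,
    };

Dropping TTM_PL_FLAG_NO_EVICT from the flags is what would make a BO
evictable again, which is exactly the knob this discussion is about.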
> From your description you don't have such a requirement, and
> then doing what etnaviv has done would be a lot simpler. Everything
> that's not related to buffer movement handling is also available
> outside of ttm already.

Yeah, I could do it like etnaviv, but that's not simpler than using TTM
directly, especially if I want some optimizations (like the TTM page
pool, ttm_eu_reserve_buffers, ttm_bo_mmap). If I have to/want to
implement them anyway, why not just use TTM directly with all those
helper functions?
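
(For reference, ttm_eu_reserve_buffers is the execbuf helper that locks
an arbitrary set of BOs deadlock-free via ww_mutex. Below is a minimal
sketch of how a submit path might use it; the lima_* naming and the
surrounding task plumbing are assumptions, not the actual patch code.)

    #include <linux/dma-fence.h>
    #include <linux/list.h>
    #include <drm/ttm/ttm_execbuf_util.h>

    /* Sketch only: reserve the BOs of a submit, queue work, attach
     * the task fence. */
    static int lima_submit_sketch(struct ttm_validate_buffer *entries,
                                  unsigned int nr,
                                  struct dma_fence *fence)
    {
            struct ww_acquire_ctx ticket;
            LIST_HEAD(bo_list);
            unsigned int i;
            int ret;

            for (i = 0; i < nr; i++)
                    list_add_tail(&entries[i].head, &bo_list);

            /* Take all BO reservations without deadlocking */
            ret = ttm_eu_reserve_buffers(&ticket, &bo_list, true, NULL);
            if (ret)
                    return ret;

            /* ... map BOs into the GPU VM, push the task to
             * drm_sched ... */

            /* Attach the fence and unreserve; TTM can then tell
             * when the BOs go idle. */
            ttm_eu_fence_buffer_objects(&ticket, &bo_list, fence);
            return 0;
    }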
Regards,
Qiang

> -Daniel
>
>> Regards,
>> Qiang
>>
>>>
>>> Christian.
>>>
>>>
>>>> - Use drm_sched for GPU task scheduling. Each OpenGL context should
>>>>   have a lima context object in the kernel to distinguish tasks
>>>>   from different users. drm_sched gets tasks from each lima context
>>>>   in a fair way.
>>>>
>>>> Not implemented:
>>>> - Dump buffer support
>>>> - Power management
>>>> - Performance counter
>>>>
>>>> This patch series just packs a pair of .c/.h files into each patch.
>>>> For the whole history of this driver's development, see:
>>>> https://github.com/yuq/linux-lima/commits/lima-4.17-rc4
>>>>
>>>> The Mesa driver is still in development and not ready for daily
>>>> usage, but it can run some simple tests like kmscube and glmark2,
>>>> see:
>>>> https://github.com/yuq/mesa-lima
>>>>
>>>> Andrei Paulau (1):
>>>>   arm64/dts: add switch-delay for meson mali
>>>>
>>>> Lima Project Developers (10):
>>>>   drm/lima: add mali 4xx GPU hardware regs
>>>>   drm/lima: add lima core driver
>>>>   drm/lima: add GPU device functions
>>>>   drm/lima: add PMU related functions
>>>>   drm/lima: add PP related functions
>>>>   drm/lima: add MMU related functions
>>>>   drm/lima: add GPU virtual memory space handling
>>>>   drm/lima: add GEM related functions
>>>>   drm/lima: add GEM Prime related functions
>>>>   drm/lima: add makefile and kconfig
>>>>
>>>> Qiang Yu (12):
>>>>   dt-bindings: add switch-delay property for mali-utgard
>>>>   arm64/dts: add switch-delay for meson mali
>>>>   Revert "drm: Nerf the preclose callback for modern drivers"
>>>>   drm/lima: add lima uapi header
>>>>   drm/lima: add L2 cache functions
>>>>   drm/lima: add GP related functions
>>>>   drm/lima: add BCAST related function
>>>>   drm/lima: add DLBU related functions
>>>>   drm/lima: add TTM subsystem functions
>>>>   drm/lima: add buffer object functions
>>>>   drm/lima: add GPU schedule using DRM_SCHED
>>>>   drm/lima: add context related functions
>>>>
>>>> Simon Shields (1):
>>>>   ARM: dts: add gpu node to exynos4
>>>>
>>>>  .../bindings/gpu/arm,mali-utgard.txt         |   4 +
>>>>  arch/arm/boot/dts/exynos4.dtsi               |  33 ++
>>>>  arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi  |   1 +
>>>>  .../boot/dts/amlogic/meson-gxl-mali.dtsi     |   1 +
>>>>  drivers/gpu/drm/Kconfig                      |   2 +
>>>>  drivers/gpu/drm/Makefile                     |   1 +
>>>>  drivers/gpu/drm/drm_file.c                   |   8 +-
>>>>  drivers/gpu/drm/lima/Kconfig                 |   9 +
>>>>  drivers/gpu/drm/lima/Makefile                |  19 +
>>>>  drivers/gpu/drm/lima/lima_bcast.c            |  65 +++
>>>>  drivers/gpu/drm/lima/lima_bcast.h            |  34 ++
>>>>  drivers/gpu/drm/lima/lima_ctx.c              | 143 +++++
>>>>  drivers/gpu/drm/lima/lima_ctx.h              |  51 ++
>>>>  drivers/gpu/drm/lima/lima_device.c           | 407 ++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_device.h           | 136 +++++
>>>>  drivers/gpu/drm/lima/lima_dlbu.c             |  75 +++
>>>>  drivers/gpu/drm/lima/lima_dlbu.h             |  37 ++
>>>>  drivers/gpu/drm/lima/lima_drv.c              | 466 ++++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_drv.h              |  77 +++
>>>>  drivers/gpu/drm/lima/lima_gem.c              | 459 ++++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_gem.h              |  41 ++
>>>>  drivers/gpu/drm/lima/lima_gem_prime.c        |  66 +++
>>>>  drivers/gpu/drm/lima/lima_gem_prime.h        |  31 ++
>>>>  drivers/gpu/drm/lima/lima_gp.c               | 293 +++++++++++
>>>>  drivers/gpu/drm/lima/lima_gp.h               |  34 ++
>>>>  drivers/gpu/drm/lima/lima_l2_cache.c         |  98 ++++
>>>>  drivers/gpu/drm/lima/lima_l2_cache.h         |  32 ++
>>>>  drivers/gpu/drm/lima/lima_mmu.c              | 154 ++++++
>>>>  drivers/gpu/drm/lima/lima_mmu.h              |  34 ++
>>>>  drivers/gpu/drm/lima/lima_object.c           | 120 +++++
>>>>  drivers/gpu/drm/lima/lima_object.h           |  87 +++
>>>>  drivers/gpu/drm/lima/lima_pmu.c              |  85 +++
>>>>  drivers/gpu/drm/lima/lima_pmu.h              |  30 ++
>>>>  drivers/gpu/drm/lima/lima_pp.c               | 418 +++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_pp.h               |  37 ++
>>>>  drivers/gpu/drm/lima/lima_regs.h             | 304 +++++++++++
>>>>  drivers/gpu/drm/lima/lima_sched.c            | 497 ++++++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_sched.h            | 126 +++++
>>>>  drivers/gpu/drm/lima/lima_ttm.c              | 409 ++++++++++++++
>>>>  drivers/gpu/drm/lima/lima_ttm.h              |  44 ++
>>>>  drivers/gpu/drm/lima/lima_vm.c               | 312 +++++++++++
>>>>  drivers/gpu/drm/lima/lima_vm.h               |  73 +++
>>>>  include/drm/drm_drv.h                        |  23 +-
>>>>  include/uapi/drm/lima_drm.h                  | 195 +++++++
>>>>  44 files changed, 5565 insertions(+), 6 deletions(-)
>>>>  create mode 100644 drivers/gpu/drm/lima/Kconfig
>>>>  create mode 100644 drivers/gpu/drm/lima/Makefile
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_bcast.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_bcast.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_ctx.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_ctx.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_device.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_device.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_dlbu.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_dlbu.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_drv.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_drv.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gem.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gem.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gp.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_gp.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_mmu.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_mmu.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_object.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_object.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_pmu.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_pmu.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_pp.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_pp.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_regs.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_sched.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_sched.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_ttm.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_ttm.h
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_vm.c
>>>>  create mode 100644 drivers/gpu/drm/lima/lima_vm.h
>>>>  create mode 100644 include/uapi/drm/lima_drm.h
>>>>
>>>
>>
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
devicetree" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html