On Tue, Jul 30, 2019 at 12:24 PM Koenig, Christian <Christian.Koenig@xxxxxxx> wrote:
>
> Am 30.07.19 um 11:38 schrieb Daniel Vetter:
> > On Tue, Jul 30, 2019 at 08:45:57AM +0000, Koenig, Christian wrote:
> >> Yeah, that looks like a good start. Just a couple of random design
> >> comments/requirements.
> >>
> >> First of all, please restructure the changes so that you more or less
> >> have the following:
> >> 1. Adding the new structures and functionality without any change to
> >> existing code.
> >> 2. Replacing the existing functionality in TTM and all of its drivers.
> >> 3. Replacing the existing functionality in i915.
> >>
> >> This should make it much easier to review the new functionality when it
> >> is not mixed with existing TTM stuff.
> >>
> >>
> >> Second, please completely drop the concept of gpu_offset or start of the
> >> memory region like here:
> >>>	drm_printf(p, " gpu_offset: 0x%08llX\n", man->region.start);
> >> At least on AMD hardware we have the following address spaces, which are
> >> sometimes even partially overlapping: VM, MC, SYSTEM, FB, AGP, XGMI, bus
> >> addresses and physical addresses.
> >>
> >> Pushing a concept of a general GPU address space into the memory
> >> management was a rather bad design mistake in TTM and we should not
> >> repeat that here.
> >>
> >> A region should only consist of a size in bytes and (internal to the
> >> region manager) allocations in that region.
> >>
> >>
> >> Third, please don't use any CPU- or architecture-specific types in any
> >> data structures:
> >>> +struct drm_mem_region {
> >>> +	resource_size_t start; /* within GPU physical address space */
> >>> +	resource_size_t io_start; /* BAR address (CPU accessible) */
> >>> +	resource_size_t size;
> >> I know that resource_size_t is mostly 64-bit on modern architectures, but
> >> dGPUs are completely separate from the architecture and we always need
> >> 64 bits here, at least for AMD hardware.
> >>
> >> So this should either always be uint64_t, or something like
> >> gpu_resource_size which depends on what the compiled-in drivers require,
> >> if we really need that.
> >>
> >> And by the way: please always use bytes for things like sizes and not
> >> number of pages, because page size is again CPU/architecture specific and
> >> GPU drivers don't necessarily care about that.
> >>
> >>
> >> And here also a few direct comments on the code:
> >>> +	union {
> >>> +		struct drm_mm *mm;
> >>> +		/* FIXME (for i915): struct drm_buddy_mm *buddy_mm; */
> >>> +		void *priv;
> >>> +	};
> >> Maybe just always use void *mm here.
> >>
> >>> +	spinlock_t move_lock;
> >>> +	struct dma_fence *move;
> >> That is TTM specific and I'm not sure if we want it in the common memory
> >> management handling.
> >>
> >> If we want that here we should probably replace the lock with some RCU
> >> and atomic fence pointer exchange first.
> >>
> >>> +/*
> >>> + * Memory types for drm_mem_region
> >>> + */
> >> #define DRM_MEM_SWAP	?
> > btw what did you have in mind for this? Since we use shmem we kinda don't
> > know whether the BO is actually swapped out or not, at least on the i915
> > side. So this would be more NOT_CURRENTLY_PINNED_AND_POSSIBLY_SWAPPED_OUT.
>
> Yeah, the problem is not everybody can use shmem. For some use cases you
> have to use memory allocated through dma_alloc_coherent().
>
> So to be able to swap this out you need a separate domain to copy it
> from whatever is backing it currently to shmem.
>
> So we essentially have:
> DRM_MEM_SYS_SWAPABLE
> DRM_MEM_SYS_NOT_GPU_MAPPED
> DRM_MEM_SYS_GPU_MAPPED
>
> Or something like that.

Yeah, i915-gem is similar. We opportunistically keep the pages pinned
sometimes even if they're not currently mapped into (what TTM calls) the
TT. So I think these three for system memory make sense for us too. I
think that's similar (at least in spirit) to the dma_alloc cache you have
going on. Maybe instead of the somewhat cumbersome NOT_GPU_MAPPED we
could have something like PINNED or so. Although it's not permanently
pinned, so maybe that's confusing too.

> >> TTM was clearly missing that, resulting in a whole bunch of extra
> >> and rather complicated handling.
> >>
> >>> +#define DRM_MEM_SYSTEM	0
> >>> +#define DRM_MEM_STOLEN	1
> >> I think we need better naming for that.
> >>
> >> STOLEN sounds way too much like stolen VRAM for integrated GPUs, but at
> >> least for TTM this is the system memory currently GPU accessible.
> > Yup, this is wrong: for i915 we use this as stolen, for TTM it's the GPU
> > translation table window into system memory. Not the same thing at all.
>
> Thought so. The closest I have in mind is GTT, but everything else works
> as well.

Would your GPU_MAPPED above work for TT? I think we'll also need STOLEN;
I'm even hearing noises that there's going to be stolen for discrete vram
for us ... Also if we expand I guess we need to teach ttm to cope with
more, or maybe treat the DRM one as some kind of sub-flavour.

-Daniel

> Christian.
>
> > -Daniel
> >
> >>
> >> Thanks for looking into that,
> >> Christian.
> >>
> >> Am 30.07.19 um 02:32 schrieb Brian Welty:
> >>> [ By request, resending to include amd-gfx + intel-gfx. Since resending,
> >>> I fixed the nit with ordering of header includes that Sam noted. ]
> >>>
> >>> This RFC series is a first implementation of some ideas expressed
> >>> earlier on dri-devel [1].
> >>>
> >>> Some of the goals (open for much debate) are:
> >>>   - Create common base structure (subclass) for memory regions (patch #1)
> >>>   - Create common memory region types (patch #2)
> >>>   - Create common set of memory_region function callbacks (based on
> >>>     ttm_mem_type_manager_funcs and intel_memory_regions_ops)
> >>>   - Create common helpers that operate on drm_mem_region to be leveraged
> >>>     by both TTM drivers and i915, reducing code duplication
> >>>   - Above might start with refactoring ttm_bo_manager.c as these are
> >>>     helpers for using drm_mm's range allocator and could be made to
> >>>     operate on DRM structures instead of TTM ones.
> >>>   - Larger goal might be to make LRU management of GEM objects common, and
> >>>     migrate those fields into drm_mem_region and drm_gem_object structures.
> >>>
> >>> Patches 1-2 implement the proposed struct drm_mem_region and add an
> >>> associated common set of definitions for memory region types.
> >>>
> >>> Patch #3 is an update to i915 and is based upon another series which is
> >>> in progress to add vram support to i915 [2].
> >>>
> >>> [1] https://lists.freedesktop.org/archives/dri-devel/2019-June/224501.html
> >>> [2] https://lists.freedesktop.org/archives/intel-gfx/2019-June/203649.html
> >>>
> >>> Brian Welty (3):
> >>>   drm: introduce new struct drm_mem_region
> >>>   drm: Introduce DRM_MEM defines for specifying type of drm_mem_region
> >>>   drm/i915: Update intel_memory_region to use nested drm_mem_region
> >>>
> >>>  drivers/gpu/drm/i915/gem/i915_gem_object.c    |  2 +-
> >>>  drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  2 +-
> >>>  drivers/gpu/drm/i915/i915_gem_gtt.c           | 10 ++---
> >>>  drivers/gpu/drm/i915/i915_gpu_error.c         |  2 +-
> >>>  drivers/gpu/drm/i915/i915_query.c             |  2 +-
> >>>  drivers/gpu/drm/i915/intel_memory_region.c    | 10 +++--
> >>>  drivers/gpu/drm/i915/intel_memory_region.h    | 19 +++------
> >>>  drivers/gpu/drm/i915/intel_region_lmem.c      | 26 ++++++-------
> >>>  .../drm/i915/selftests/intel_memory_region.c  |  8 ++--
> >>>  drivers/gpu/drm/ttm/ttm_bo.c                  | 34 +++++++++-------
> >>>  drivers/gpu/drm/ttm/ttm_bo_manager.c          | 14 +++----
> >>>  drivers/gpu/drm/ttm/ttm_bo_util.c             | 11 +++---
> >>>  drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c |  8 ++--
> >>>  drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c    |  4 +-
> >>>  include/drm/drm_mm.h                          | 39 ++++++++++++++++++-
> >>>  include/drm/ttm/ttm_bo_api.h                  |  2 +-
> >>>  include/drm/ttm/ttm_bo_driver.h               | 16 ++++----
> >>>  include/drm/ttm/ttm_placement.h               |  8 ++--
> >>>  18 files changed, 124 insertions(+), 93 deletions(-)
> >>>
>

--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/dri-devel
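As a reference point for the discussion above, here is a minimal sketch of
what drm_mem_region and the DRM_MEM type defines might look like if the
points raised in the thread are applied: sizes always in bytes as a plain
64-bit type, no gpu_offset/start, and a driver-opaque allocator pointer
instead of the drm_mm/priv union. This is purely illustrative, not the
posted patches; DRM_MEM_VRAM and the numeric values are assumptions, and
the other names are taken from the suggestions quoted above.

/*
 * Illustrative sketch only -- not from the posted RFC patches.
 */
#include <linux/types.h>

struct drm_mem_region {
	u64 size;	/* region size in bytes, never in pages */
	u64 io_start;	/* BAR address (CPU accessible), if any */
	void *mm;	/* region manager private: drm_mm, buddy, ... */
};

/*
 * Memory types for drm_mem_region, using the system-memory split
 * suggested by Christian; exact names are still under discussion.
 */
#define DRM_MEM_SYS_SWAPABLE		0
#define DRM_MEM_SYS_NOT_GPU_MAPPED	1
#define DRM_MEM_SYS_GPU_MAPPED		2	/* what TTM calls TT */
#define DRM_MEM_VRAM			3	/* device-local memory (assumed) */

Keeping the allocator pointer opaque (void *mm) would let TTM keep using
drm_mm while i915 plugs in its buddy allocator, without the common code
ever interpreting the pointer.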