Re: [RFC PATCH 0/3] Propose new struct drm_mem_region

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jul 31, 2019 at 8:54 AM Koenig, Christian
<Christian.Koenig@xxxxxxx> wrote:
>
> Am 31.07.19 um 02:51 schrieb Brian Welty:
> [SNIP]
> >>>>>> +/*
> >>>>>> + * Memory types for drm_mem_region
> >>>>>> + */
> >>>>> #define DRM_MEM_SWAP    ?
> >>>> btw what did you have in mind for this? Since we use shmem we kinda don't
> >>>> know whether the BO is actually swapped out or not, at least on the i915
> >>>> side. So this would be more NOT_CURRENTLY_PINNED_AND_POSSIBLY_SWAPPED_OUT.
> >>> Yeah, the problem is not everybody can use shmem. For some use cases you
> >>> have to use memory allocated through dma_alloc_coherent().
> >>>
> >>> So to be able to swap this out you need a separate domain to copy it
> >>> from whatever is backing it currently to shmem.
> >>>
> >>> So we essentially have:
> >>> DRM_MEM_SYS_SWAPABLE
> >>> DRM_MEM_SYS_NOT_GPU_MAPPED
> >>> DRM_MEM_SYS_GPU_MAPPED
> >>>
> >>> Or something like that.
> >> Yeah i915-gem is similar. We oportunistically keep the pages pinned
> >> sometimes even if not currently mapped into the (what ttm calls) TT.
> >> So I think these three for system memory make sense for us too. I
> >> think that's similar (at least in spirit) to the dma_alloc cache you
> >> have going on. Mabye instead of the somewhat cumbersome NOT_GPU_MAPPED
> >> we could have something like PINNED or so. Although it's not
> >> permanently pinned, so maybe that's confusing too.
> >>
> > Okay, I see now I was far off the mark with what I thought TTM_PL_SYSTEM
> > was.  The discussion helped clear up several bits of confusion on my part.
> >  From proposed names, I find MAPPED and PINNED slightly confusing.
> > In terms of backing store description, maybe these are a little better:
> >    DRM_MEM_SYS_UNTRANSLATED  (TTM_PL_SYSTEM)
> >    DRM_MEM_SYS_TRANSLATED    (TTM_PL_TT or i915's SYSTEM)
>
> That's still not correct. Let me describe what each of the tree stands for:
>
> 1. The backing store is a shmem file so the individual pages are
> swapable by the core OS.
> 2. The backing store is allocate GPU accessible but not currently in use
> by the GPU.
> 3. The backing store is currently in use by the GPU.
>
> For i915 all three of those are basically the same and you only need to
> worry about it much.

We do pretty much have these three states for i915 gem bo too. Of
course none have a reasonable upper limit since it's all shared
memory. The hard limit would be system memory + swap for 1, and only
system memory for 2 and 3.

> But for other drivers that's certainly not true and we need this
> distinction of the backing store of an object.
>
> I'm just not sure how we would handle that for cgroups. From experience
> we certainly want a limit over all 3, but you usually also want to limit
> 3 alone.

To avoid lolz against the shrinker I think you also want to limit 2+3.
Afaiui ttm does that with the global limit, to avoid driving the
system against the wall.

> And you also want to limit the amount of bytes moved between those
> states because each state transition might have a bandwidth cost
> associated with it.
>
> > Are these allowed to be both overlapping? Or non-overlapping (partitioned)?
> > Per Christian's point about removing .start, seems it doesn't need to
> > matter.
>
> You should probably completely drop the idea of this being regions.
>
> And we should also rename them to something like drm_mem_domains to make
> that clear.

+1 on domains. Some of these domains might be physically contiguous
regions, but some clearly arent.

> > Whatever we define for these sub-types, does it make sense for SYSTEM and
> > VRAM to each have them defined?
>
> No, absolutely not. VRAM as well as other private memory types are
> completely driver specific.
>
> > I'm unclear how DRM_MEM_SWAP (or DRM_MEM_SYS_SWAPABLE) would get
> > configured by driver...  this is a fixed size partition of host memory?
> > Or it is a kind of dummy memory region just for swap implementation?
>
> #1 and #2 in my example above should probably not be configured by the
> driver itself.
>
> And yes seeing those as special for state handling sounds like the
> correct approach to me.

Do we have any hw that wants custom versions of 3? The only hw designs
I know of either have one shared translation table (but only one per
device, so having just 1 domain is good enough). Or TT mappings are in
the per-process pagetables, and then you're defacto unlimited (and
again one domain is good enough). So roughly:

- 1&2 global accross all drivers. 1 and 2 are disjoint (i.e. a bo is
only account to one of them, never both).
- 3 would be a subgroup of 2, and per device. A bo in group 3 is also
always in group 2.

For VRAM and VRAM-similar things (like stolen system memory, or if you
have VRAM that's somehow split up like with a dual gpu perhaps) I
agree the driver needs to register that. And we just have some
standard flags indicating that "this is kinda like VRAM".
-Daniel

>
> Regards,
> Christian.
>
> >>>>> TTM was clearly missing that resulting in a whole bunch of extra
> >>>>> handling and rather complicated handling.
> >>>>>
> >>>>>> +#define DRM_MEM_SYSTEM     0
> >>>>>> +#define DRM_MEM_STOLEN     1
> >>>>> I think we need a better naming for that.
> >>>>>
> >>>>> STOLEN sounds way to much like stolen VRAM for integrated GPUs, but at
> >>>>> least for TTM this is the system memory currently GPU accessible.
> >>>> Yup this is wrong, for i915 we use this as stolen, for ttm it's the gpu
> >>>> translation table window into system memory. Not the same thing at all.
> >>> Thought so. The closest I have in mind is GTT, but everything else works
> >>> as well.
> >> Would your GPU_MAPPED above work for TT? I think we'll also need
> >> STOLEN, I'm even hearing noises that there's going to be stolen for
> >> discrete vram for us ... Also if we expand I guess we need to teach
> >> ttm to cope with more, or maybe treat the DRM one as some kind of
> >> sub-flavour.
> > Daniel, maybe what i915 calls stolen could just be DRM_MEM_RESERVED or
> > DRM_MEM_PRIV.  Or maybe can argue it falls into UNTRANSLATED type that
> > I suggested above, I'm not sure.
> >
> > -Brian
> >
> >
> >> -Daniel
> >>
> >>> Christian.
> >>>
> >>>> -Daniel
> >>>>
> >>>>> Thanks for looking into that,
> >>>>> Christian.
> >>>>>
> >>>>> Am 30.07.19 um 02:32 schrieb Brian Welty:
> >>>>>> [ By request, resending to include amd-gfx + intel-gfx.  Since resending,
> >>>>>>      I fixed the nit with ordering of header includes that Sam noted. ]
> >>>>>>
> >>>>>> This RFC series is first implementation of some ideas expressed
> >>>>>> earlier on dri-devel [1].
> >>>>>>
> >>>>>> Some of the goals (open for much debate) are:
> >>>>>>      - Create common base structure (subclass) for memory regions (patch #1)
> >>>>>>      - Create common memory region types (patch #2)
> >>>>>>      - Create common set of memory_region function callbacks (based on
> >>>>>>        ttm_mem_type_manager_funcs and intel_memory_regions_ops)
> >>>>>>      - Create common helpers that operate on drm_mem_region to be leveraged
> >>>>>>        by both TTM drivers and i915, reducing code duplication
> >>>>>>      - Above might start with refactoring ttm_bo_manager.c as these are
> >>>>>>        helpers for using drm_mm's range allocator and could be made to
> >>>>>>        operate on DRM structures instead of TTM ones.
> >>>>>>      - Larger goal might be to make LRU management of GEM objects common, and
> >>>>>>        migrate those fields into drm_mem_region and drm_gem_object strucures.
> >>>>>>
> >>>>>> Patches 1-2 implement the proposed struct drm_mem_region and adds
> >>>>>> associated common set of definitions for memory region type.
> >>>>>>
> >>>>>> Patch #3 is update to i915 and is based upon another series which is
> >>>>>> in progress to add vram support to i915 [2].
> >>>>>>
> >>>>>> [1] https://lists.freedesktop.org/archives/dri-devel/2019-June/224501.html
> >>>>>> [2] https://lists.freedesktop.org/archives/intel-gfx/2019-June/203649.html
> >>>>>>
> >>>>>> Brian Welty (3):
> >>>>>>      drm: introduce new struct drm_mem_region
> >>>>>>      drm: Introduce DRM_MEM defines for specifying type of drm_mem_region
> >>>>>>      drm/i915: Update intel_memory_region to use nested drm_mem_region
> >>>>>>
> >>>>>>     drivers/gpu/drm/i915/gem/i915_gem_object.c    |  2 +-
> >>>>>>     drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  2 +-
> >>>>>>     drivers/gpu/drm/i915/i915_gem_gtt.c           | 10 ++---
> >>>>>>     drivers/gpu/drm/i915/i915_gpu_error.c         |  2 +-
> >>>>>>     drivers/gpu/drm/i915/i915_query.c             |  2 +-
> >>>>>>     drivers/gpu/drm/i915/intel_memory_region.c    | 10 +++--
> >>>>>>     drivers/gpu/drm/i915/intel_memory_region.h    | 19 +++------
> >>>>>>     drivers/gpu/drm/i915/intel_region_lmem.c      | 26 ++++++-------
> >>>>>>     .../drm/i915/selftests/intel_memory_region.c  |  8 ++--
> >>>>>>     drivers/gpu/drm/ttm/ttm_bo.c                  | 34 +++++++++-------
> >>>>>>     drivers/gpu/drm/ttm/ttm_bo_manager.c          | 14 +++----
> >>>>>>     drivers/gpu/drm/ttm/ttm_bo_util.c             | 11 +++---
> >>>>>>     drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c |  8 ++--
> >>>>>>     drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c    |  4 +-
> >>>>>>     include/drm/drm_mm.h                          | 39 ++++++++++++++++++-
> >>>>>>     include/drm/ttm/ttm_bo_api.h                  |  2 +-
> >>>>>>     include/drm/ttm/ttm_bo_driver.h               | 16 ++++----
> >>>>>>     include/drm/ttm/ttm_placement.h               |  8 ++--
> >>>>>>     18 files changed, 124 insertions(+), 93 deletions(-)
> >>>>>>
> >>
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx




[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux