Re: [PATCH 00/11] Introduce drm evictable lru

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Oak, Christian

On 11/2/23 05:32, Oak Zeng wrote:
We plan to implement xe driver's shared virtual memory
manager (aka SVM) without buffer object concept. This
means we won't build our shared virtual memory manager
upon TTM infrastructure like amdgpu does.

Even though this approach is more efficient, it does
create a problem for memory eviction when there is
memory pressure: memory allocated by SVM and memory
allocated by TTM should be able to mutually evict
from each other. TTM's resource manager maintains
a LRU list for each memory type and this list is used
to pick up the memory eviction victim. Since we don't
use TTM for SVM implementation, SVM allocated memory
can't be added to TTM resource manager's LRU list. Thus
SVM allocated memory and TTM allocated memory are not
mutually evictable.

See more discussion on this topic here:
https://www.spinics.net/lists/dri-devel/msg410740.html

This series solve this problem by creating a shared
LRU list b/t SVM and TTM, or any other resource manager.

The basic idea is, abstract a drm_lru_entity structure
which is supposed to be embedded in ttm_resource structure,
or any other resource manager. The resource LRU list is a
list of drm_lru_entity. drm_lru_entity has eviction function
pointers which can be used to call back drivers' specific
eviction function to evict a memory resource.

Introduce global drm_lru_manager to struct drm_device
to manage LRU lists. Each memory type or memory region
can have a LRU list. TTM resource manager's LRU list functions
including bulk move functions are moved to drm lru manager.
drm lru manager provides a evict_first function to evict
the first memory resource from LRU list. This function can
be called from TTM, SVM or any other resource manager, so
all the memory allocated in the drm sub-system can be mutually
evicted.

The lru_lock is also moved from struct ttm_device to struct
drm_device.

Opens:
1) memory accounting: currently the ttm resource manager's
memory accounting functions is kept at ttm resource manager.
Since memory accounting should be cross TTM and SVM, it should
be ideally in the drm lru manager layer. This will be polished
in the future.

2) eviction callback function interface: The current eviction
function interface is designed to meet TTM memory eviction
requirements. When SVM is in the picture, this interface
need to be futher tunned to meet SVM requirement also.

This series is not tested and it is only compiled for xe
driver. Some minor changes are needed for other driver
such as amdgpu, nouveau etc. I intended to send this out
as a request for comment series to get some early feedback,
to see whether this is the right direction to go. I will
futher polish this series after a direction is agreed.

Oak Zeng (11):
   drm/ttm: re-parameter ttm_device_init
   drm: move lru_lock from ttm_device to drm_device
   drm: introduce drm evictable LRU
   drm: Add evict function pointer to drm lru entity
   drm: Replace ttm macros with drm macros
   drm/ttm: Set lru manager to ttm resource manager
   drm/ttm: re-parameterize a few ttm functions
   drm: Initialize drm lru manager
   drm/ttm: Use drm LRU manager iterator
   drm/ttm: Implement ttm memory evict functions
   drm/ttm: Write ttm functions using drm lru manager functions

  drivers/gpu/drm/Makefile                      |   1 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c   |   6 +
  .../gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c   |   6 +
  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       |  10 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c        |   6 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h        |   2 +-
  drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c  |  10 +-
  drivers/gpu/drm/drm_drv.c                     |   1 +
  drivers/gpu/drm/drm_evictable_lru.c           | 266 ++++++++++++++++++
  drivers/gpu/drm/drm_gem_vram_helper.c         |  10 +-
  drivers/gpu/drm/i915/gem/i915_gem_ttm.c       |   6 +-
  drivers/gpu/drm/i915/i915_ttm_buddy_manager.c |  10 +
  drivers/gpu/drm/i915/intel_region_ttm.c       |   4 +-
  drivers/gpu/drm/i915/selftests/mock_region.c  |   2 +-
  drivers/gpu/drm/loongson/lsdc_ttm.c           |  10 +-
  drivers/gpu/drm/nouveau/nouveau_ttm.c         |  22 +-
  drivers/gpu/drm/qxl/qxl_ttm.c                 |   6 +-
  drivers/gpu/drm/radeon/radeon_ttm.c           |  10 +-
  drivers/gpu/drm/ttm/tests/ttm_device_test.c   |   2 +-
  drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c |   2 +-
  drivers/gpu/drm/ttm/ttm_bo.c                  | 247 ++++++++++++----
  drivers/gpu/drm/ttm/ttm_bo_util.c             |  20 +-
  drivers/gpu/drm/ttm/ttm_bo_vm.c               |   2 +-
  drivers/gpu/drm/ttm/ttm_device.c              |  55 ++--
  drivers/gpu/drm/ttm/ttm_module.h              |   3 +-
  drivers/gpu/drm/ttm/ttm_range_manager.c       |  14 +-
  drivers/gpu/drm/ttm/ttm_resource.c            | 242 +++-------------
  drivers/gpu/drm/ttm/ttm_sys_manager.c         |   8 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_bo.c            |   2 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_bo.h            |   2 +-
  drivers/gpu/drm/vmwgfx/vmwgfx_drv.c           |   6 +-
  .../gpu/drm/vmwgfx/vmwgfx_system_manager.c    |   6 +
  drivers/gpu/drm/xe/xe_bo.c                    |  48 ++--
  drivers/gpu/drm/xe/xe_bo.h                    |   5 +-
  drivers/gpu/drm/xe/xe_device.c                |   2 +-
  drivers/gpu/drm/xe/xe_dma_buf.c               |   4 +-
  drivers/gpu/drm/xe/xe_exec.c                  |   6 +-
  drivers/gpu/drm/xe/xe_migrate.c               |   6 +-
  drivers/gpu/drm/xe/xe_res_cursor.h            |  10 +-
  drivers/gpu/drm/xe/xe_ttm_sys_mgr.c           |   8 +-
  drivers/gpu/drm/xe/xe_ttm_vram_mgr.c          |  18 +-
  drivers/gpu/drm/xe/xe_vm.c                    |   6 +-
  drivers/gpu/drm/xe/xe_vm_types.h              |   2 +-
  include/drm/drm_device.h                      |  12 +
  include/drm/drm_evictable_lru.h               | 260 +++++++++++++++++
  include/drm/ttm/ttm_bo.h                      |  10 +-
  include/drm/ttm/ttm_device.h                  |  13 +-
  include/drm/ttm/ttm_range_manager.h           |  17 +-
  include/drm/ttm/ttm_resource.h                | 117 +++-----
  49 files changed, 1042 insertions(+), 501 deletions(-)
  create mode 100644 drivers/gpu/drm/drm_evictable_lru.c
  create mode 100644 include/drm/drm_evictable_lru.h

Some additional comments after looking through this patchset and the comments.

1) General: IMO a good start.

2) Where should this reside? I'm not a big fan of *prescribing* that a component should be part of a specific device structure. IMO that leads to dumping the whole drm namespace into this component rather than to keep it isolated with as few dependencies as possible. I would make the part that should reside in the drm device a "struct drm_lru_device", and the resource base class a "struct drm_lru_resource". Whether the drm device then wants to embed the struct drm_lru_device (nice if you need this component) or have a pointer to it (nice if you don't need this component) and whether the ttm_device should subclass the drm device becomes pretty orthogonal to the actual implementation, and we'd avoid hard to follow code cross-references all over the place. The drm_gpuvm IMO did a good job with this, but I know there are other opinions on this, and I don't want it to become a blocker.

3) Christian's comment about cursors into the LRU for LRU traversal restart
This is quickly becoming a must for the Xe driver, and a very-nice to have for a working TTM shrinker, but I think we should start to bring what we have in TTM currently over and do the cursors as a separate task. I have it pretty much on top of my priority list currently. Both downstream i915 and drm_gpuvm have a way to handle this nicely by permutating the LRU list just before the lock needs to be released, so I was thinking something similar. The complication would be the bulk move tracking.

4) The struct drm_lru_evict_ctx - This one needs to be common across all users since it's set up by the evictor and used by the evictee ops.

/Thomas





[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux