Hi Oak, Christian
On 11/2/23 05:32, Oak Zeng wrote:
We plan to implement xe driver's shared virtual memory
manager (aka SVM) without buffer object concept. This
means we won't build our shared virtual memory manager
upon TTM infrastructure like amdgpu does.
Even though this approach is more efficient, it does
create a problem for memory eviction when there is
memory pressure: memory allocated by SVM and memory
allocated by TTM should be able to mutually evict
from each other. TTM's resource manager maintains
a LRU list for each memory type and this list is used
to pick up the memory eviction victim. Since we don't
use TTM for SVM implementation, SVM allocated memory
can't be added to TTM resource manager's LRU list. Thus
SVM allocated memory and TTM allocated memory are not
mutually evictable.
See more discussion on this topic here:
https://www.spinics.net/lists/dri-devel/msg410740.html
This series solve this problem by creating a shared
LRU list b/t SVM and TTM, or any other resource manager.
The basic idea is, abstract a drm_lru_entity structure
which is supposed to be embedded in ttm_resource structure,
or any other resource manager. The resource LRU list is a
list of drm_lru_entity. drm_lru_entity has eviction function
pointers which can be used to call back drivers' specific
eviction function to evict a memory resource.
Introduce global drm_lru_manager to struct drm_device
to manage LRU lists. Each memory type or memory region
can have a LRU list. TTM resource manager's LRU list functions
including bulk move functions are moved to drm lru manager.
drm lru manager provides a evict_first function to evict
the first memory resource from LRU list. This function can
be called from TTM, SVM or any other resource manager, so
all the memory allocated in the drm sub-system can be mutually
evicted.
The lru_lock is also moved from struct ttm_device to struct
drm_device.
Opens:
1) memory accounting: currently the ttm resource manager's
memory accounting functions is kept at ttm resource manager.
Since memory accounting should be cross TTM and SVM, it should
be ideally in the drm lru manager layer. This will be polished
in the future.
2) eviction callback function interface: The current eviction
function interface is designed to meet TTM memory eviction
requirements. When SVM is in the picture, this interface
need to be futher tunned to meet SVM requirement also.
This series is not tested and it is only compiled for xe
driver. Some minor changes are needed for other driver
such as amdgpu, nouveau etc. I intended to send this out
as a request for comment series to get some early feedback,
to see whether this is the right direction to go. I will
futher polish this series after a direction is agreed.
Oak Zeng (11):
drm/ttm: re-parameter ttm_device_init
drm: move lru_lock from ttm_device to drm_device
drm: introduce drm evictable LRU
drm: Add evict function pointer to drm lru entity
drm: Replace ttm macros with drm macros
drm/ttm: Set lru manager to ttm resource manager
drm/ttm: re-parameterize a few ttm functions
drm: Initialize drm lru manager
drm/ttm: Use drm LRU manager iterator
drm/ttm: Implement ttm memory evict functions
drm/ttm: Write ttm functions using drm lru manager functions
drivers/gpu/drm/Makefile | 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c | 6 +
.../gpu/drm/amd/amdgpu/amdgpu_preempt_mgr.c | 6 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 10 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 6 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 10 +-
drivers/gpu/drm/drm_drv.c | 1 +
drivers/gpu/drm/drm_evictable_lru.c | 266 ++++++++++++++++++
drivers/gpu/drm/drm_gem_vram_helper.c | 10 +-
drivers/gpu/drm/i915/gem/i915_gem_ttm.c | 6 +-
drivers/gpu/drm/i915/i915_ttm_buddy_manager.c | 10 +
drivers/gpu/drm/i915/intel_region_ttm.c | 4 +-
drivers/gpu/drm/i915/selftests/mock_region.c | 2 +-
drivers/gpu/drm/loongson/lsdc_ttm.c | 10 +-
drivers/gpu/drm/nouveau/nouveau_ttm.c | 22 +-
drivers/gpu/drm/qxl/qxl_ttm.c | 6 +-
drivers/gpu/drm/radeon/radeon_ttm.c | 10 +-
drivers/gpu/drm/ttm/tests/ttm_device_test.c | 2 +-
drivers/gpu/drm/ttm/tests/ttm_kunit_helpers.c | 2 +-
drivers/gpu/drm/ttm/ttm_bo.c | 247 ++++++++++++----
drivers/gpu/drm/ttm/ttm_bo_util.c | 20 +-
drivers/gpu/drm/ttm/ttm_bo_vm.c | 2 +-
drivers/gpu/drm/ttm/ttm_device.c | 55 ++--
drivers/gpu/drm/ttm/ttm_module.h | 3 +-
drivers/gpu/drm/ttm/ttm_range_manager.c | 14 +-
drivers/gpu/drm/ttm/ttm_resource.c | 242 +++-------------
drivers/gpu/drm/ttm/ttm_sys_manager.c | 8 +-
drivers/gpu/drm/vmwgfx/vmwgfx_bo.c | 2 +-
drivers/gpu/drm/vmwgfx/vmwgfx_bo.h | 2 +-
drivers/gpu/drm/vmwgfx/vmwgfx_drv.c | 6 +-
.../gpu/drm/vmwgfx/vmwgfx_system_manager.c | 6 +
drivers/gpu/drm/xe/xe_bo.c | 48 ++--
drivers/gpu/drm/xe/xe_bo.h | 5 +-
drivers/gpu/drm/xe/xe_device.c | 2 +-
drivers/gpu/drm/xe/xe_dma_buf.c | 4 +-
drivers/gpu/drm/xe/xe_exec.c | 6 +-
drivers/gpu/drm/xe/xe_migrate.c | 6 +-
drivers/gpu/drm/xe/xe_res_cursor.h | 10 +-
drivers/gpu/drm/xe/xe_ttm_sys_mgr.c | 8 +-
drivers/gpu/drm/xe/xe_ttm_vram_mgr.c | 18 +-
drivers/gpu/drm/xe/xe_vm.c | 6 +-
drivers/gpu/drm/xe/xe_vm_types.h | 2 +-
include/drm/drm_device.h | 12 +
include/drm/drm_evictable_lru.h | 260 +++++++++++++++++
include/drm/ttm/ttm_bo.h | 10 +-
include/drm/ttm/ttm_device.h | 13 +-
include/drm/ttm/ttm_range_manager.h | 17 +-
include/drm/ttm/ttm_resource.h | 117 +++-----
49 files changed, 1042 insertions(+), 501 deletions(-)
create mode 100644 drivers/gpu/drm/drm_evictable_lru.c
create mode 100644 include/drm/drm_evictable_lru.h
Some additional comments after looking through this patchset and the
comments.
1) General: IMO a good start.
2) Where should this reside? I'm not a big fan of *prescribing* that a
component should be part of a specific device structure. IMO that leads
to dumping the whole drm namespace into this component rather than to
keep it isolated with as few dependencies as possible. I would make the
part that should reside in the drm device a "struct drm_lru_device", and
the resource base class a "struct drm_lru_resource". Whether the drm
device then wants to embed the struct drm_lru_device (nice if you need
this component) or have a pointer to it (nice if you don't need this
component) and whether the ttm_device should subclass the drm device
becomes pretty orthogonal to the actual implementation, and we'd avoid
hard to follow code cross-references all over the place. The drm_gpuvm
IMO did a good job with this, but I know there are other opinions on
this, and I don't want it to become a blocker.
3) Christian's comment about cursors into the LRU for LRU traversal restart
This is quickly becoming a must for the Xe driver, and a very-nice to
have for a working TTM shrinker, but I think we should start to bring
what we have in TTM currently over and do the cursors as a separate
task. I have it pretty much on top of my priority list currently. Both
downstream i915 and drm_gpuvm have a way to handle this nicely by
permutating the LRU list just before the lock needs to be released, so I
was thinking something similar. The complication would be the bulk move
tracking.
4) The struct drm_lru_evict_ctx - This one needs to be common across all
users since it's set up by the evictor and used by the evictee ops.
/Thomas