unified LRU for ttm and svm


 



Hello all,

 

As a follow-up to this thread https://www.spinics.net/lists/dri-devel/msg410740.html, I looked further into the idea of a shared LRU list for both ttm/bo and svm (to achieve mutual eviction between them). I came up with a rough design, which I would like to align on with you before I move too far.

 

As illustrated in the diagram below:

 

  1. There will be a global drm_lru_manager to maintain the shared LRU lists. Each memory type will have its own list, i.e., system memory has a list and gpu memory has a list. On a system with multiple gpu memory regions, we can have multiple GPU LRU lists.
  2. Move the LRU operation functions (such as the bulk_move related ones) from ttm_resource_manager to drm_lru_manager.
  3. drm_lru_manager should be initialized during device initialization. The ttm layer or svm layer can hold a weak reference to it for convenience.
  4. Abstract a drm_lru_entity: this is supposed to be embedded in the ttm_resource and svm_resource structs, as illustrated. Since ttm_resource and svm_resource are quite different in nature (ttm_resource is coupled with a bo, while svm_resource is struct page/pfn based), we can’t provide a unified eviction function for them. So an evict_func pointer is introduced in drm_lru_entity [Note 1].
  5. lru_lock: currently the lru_lock is in the ttm_device structure. Ideally this would be moved to drm_lru_manager. But besides the lru list, lru_lock also protects other ttm-specific things such as ttm_device’s pinned list. The current plan is to move lru_lock to xe_device/amdgpu_device, and ttm_device or svm can hold a weak reference to it for convenience.
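
To make points 1 and 4 more concrete, here is a rough userspace sketch of how the pieces could fit together. It is not kernel code and not a proposed API: apart from drm_lru_manager, drm_lru_entity and evict_func, every name in it (drm_lru_add, drm_lru_evict_first, the *_sketch structs, the two-entry memory type array) is made up for illustration, the list helpers are simplified stand-ins for the kernel's list_head, and locking (lru_lock) is left out entirely.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Minimal circular doubly linked list, modeled on the kernel's list_head. */
struct list_head { struct list_head *prev, *next; };

static void list_init(struct list_head *h) { h->prev = h->next = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* The entity embedded in both ttm_resource and svm_resource. */
struct drm_lru_entity {
	struct list_head lru;
	/* Owner-specific eviction callback (hypothetical signature). */
	int (*evict_func)(struct drm_lru_entity *entity);
};

#define DRM_LRU_NUM_MEM_TYPES 2	/* assumption: 0 = system, 1 = gpu memory */

/* One LRU list per memory type; lru_lock is omitted in this sketch. */
struct drm_lru_manager {
	struct list_head lru[DRM_LRU_NUM_MEM_TYPES];
};

static void drm_lru_manager_init(struct drm_lru_manager *mgr)
{
	for (int i = 0; i < DRM_LRU_NUM_MEM_TYPES; i++)
		list_init(&mgr->lru[i]);
}

/* Add an entity at the tail, i.e. as the most recently used entry. */
static void drm_lru_add(struct drm_lru_manager *mgr, int mem_type,
			struct drm_lru_entity *entity,
			int (*evict_func)(struct drm_lru_entity *))
{
	entity->evict_func = evict_func;
	list_add_tail(&entity->lru, &mgr->lru[mem_type]);
}

/* Evict the least recently used entity; returns -1 if the list is empty. */
static int drm_lru_evict_first(struct drm_lru_manager *mgr, int mem_type)
{
	struct list_head *head = &mgr->lru[mem_type];
	struct drm_lru_entity *entity;

	if (head->next == head)
		return -1;
	entity = container_of(head->next, struct drm_lru_entity, lru);
	list_del(&entity->lru);
	return entity->evict_func(entity);
}

/* Stand-ins for the two resource types; "evicted" only records the call. */
struct ttm_resource_sketch { int evicted; struct drm_lru_entity lru_entity; };
struct svm_resource_sketch { int evicted; struct drm_lru_entity lru_entity; };

static int ttm_evict(struct drm_lru_entity *e)
{
	container_of(e, struct ttm_resource_sketch, lru_entity)->evicted = 1;
	return 0;
}

static int svm_evict(struct drm_lru_entity *e)
{
	container_of(e, struct svm_resource_sketch, lru_entity)->evicted = 1;
	return 0;
}
```

The point of the sketch is that drm_lru_evict_first() never needs to know whether the head of the list belongs to ttm or svm. Each callback does its own container_of() back to the resource type it knows about, which is what allows mutual eviction from a single list.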

 

 

 

Note 1: I have been considering a structure like the one below. Each hmm/svm resource page is backed by a struct page, and struct page already has an lru member. So theoretically the LRU list can be as shown below. This way we don’t need to introduce the drm_lru_entity struct. The difficulty is that, without modifying the Linux struct page, we can’t cast an lru node back to struct page or struct ttm_resource, since we don’t know whether a given node is used by ttm or svm. This is why I had to introduce drm_lru_entity to hold an evict_func as described above. But let me know if you have a better idea.
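
A small userspace sketch of the difficulty described in Note 1. The two structs are simplified, made-up stand-ins (not the real struct page or struct ttm_resource layouts), and owner_of() is a local stand-in for the kernel's container_of(). Going from an lru node back to its owner requires knowing the owner's type up front, because the node sits at a different offset in each struct.

```c
#include <stddef.h>

/* Local stand-in for the kernel's container_of(). */
#define owner_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct lru_node { struct lru_node *prev, *next; };

/* Made-up layouts; only the position of the lru member matters here. */
struct page_like {
	unsigned long flags;
	struct lru_node lru;
};

struct ttm_resource_like {
	unsigned long start;
	unsigned long size;
	int mem_type;
	struct lru_node lru;
};

/* Correct only if the caller already knows the node lives in a page. */
static struct page_like *lru_to_page(struct lru_node *node)
{
	return owner_of(node, struct page_like, lru);
}

/* Correct only if the caller already knows the node lives in a resource. */
static struct ttm_resource_like *lru_to_ttm_resource(struct lru_node *node)
{
	return owner_of(node, struct ttm_resource_like, lru);
}

/*
 * A walker that only sees struct lru_node has nothing in the node itself
 * that tells it which of the two helpers above is the right one to call.
 */
static int lru_offsets_differ(void)
{
	return offsetof(struct page_like, lru) !=
	       offsetof(struct ttm_resource_like, lru);
}
```

With a bare node on a mixed list, picking the wrong helper would silently produce a pointer into the wrong place, which is why the design above carries the type knowledge in an evict_func stored next to the node.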

 

 

Thanks,

Oak

 

