Hi Felix,

It is great to hear from you! While implementing the HMM-based SVM for intel devices, I ran into an interesting problem: HMM uses a struct page based memory management scheme, which is completely different from the BO/TTM style memory management philosophy. Writing SVM code on top of the BO/TTM concepts seems overkill and awkward, so I thought we'd better make the SVM code BO-less and TTM-less.

On the other hand, vram eviction and cgroup memory accounting are currently all hooked into the TTM layer, which means a TTM-less SVM driver won't be able to evict vram allocated through TTM/gpu_vram_mgr. Ideally HMM migration should use drm-buddy for vram allocation, but we need to solve the TTM/HMM mutual eviction problem you pointed out (I am working with application engineers to figure out whether mutual eviction can truly benefit applications).

Maybe we can implement a TTM-less vram management block which can be shared b/t the HMM-based driver and the BO-based driver; a rough sketch of what I mean follows below:

* allocate/free memory from drm-buddy, buddy-block based
* memory eviction logic, allowing the driver to specify which allocations are evictable
* memory accounting, cgroup logic

Maybe such a block can be placed at the drm layer (call it drm_vram_mgr for now), so it can be shared b/t amd and intel. That is why I involved the amd folks. Today both the amd and intel-xe drivers implement a TTM-based vram manager, which doesn't serve the above design goals. Once drm_vram_mgr is implemented, both amd's and intel's BO-based/TTM-based vram managers, as well as the HMM-based vram manager, can call into this drm_vram_mgr.
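
To make this a bit more concrete, here is a very rough sketch of what such an interface could look like. To be clear, everything below is made up for illustration (drm_vram_mgr, drm_vram_alloc, the evict callback and so on are hypothetical names), not a proposal for the final API:

#include <linux/list.h>
#include <linux/mutex.h>
#include <drm/drm_buddy.h>

/* Rough sketch only: a TTM-less vram manager that BO-based and
 * HMM-based paths could share. All names are hypothetical. */
struct drm_vram_mgr {
	struct drm_buddy mm;		/* backing buddy allocator */
	struct mutex lock;
	struct list_head evictable;	/* allocations marked evictable */
	u64 used;			/* accounting; cgroup charging would hook in here */
};

struct drm_vram_alloc {
	struct list_head blocks;	/* list of drm_buddy_block */
	struct list_head evict_link;	/* link on mgr->evictable */
	u64 size;
	/*
	 * Owner-provided callback, invoked by the manager under memory
	 * pressure. This is what would let the TTM-based and HMM-based
	 * users evict each other out of the same physical pool.
	 */
	int (*evict)(struct drm_vram_alloc *alloc);
};

/* Allocate @size bytes of vram as buddy blocks; @evictable tells the
 * manager whether this allocation may be reclaimed via alloc->evict(). */
int drm_vram_mgr_alloc(struct drm_vram_mgr *mgr, u64 size,
		       bool evictable, struct drm_vram_alloc *alloc);

/* Return the buddy blocks to the pool and undo the accounting charge. */
void drm_vram_mgr_free(struct drm_vram_mgr *mgr,
		       struct drm_vram_alloc *alloc);

The point of the evict callback is that eviction can walk mgr->evictable without caring whether an allocation is backed by a BO or by an HMM migration, which is the mutual-eviction property you mentioned.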
Thanks again,
Oak

> -----Original Message-----
> From: Felix Kuehling <felix.kuehling@xxxxxxx>
> Sent: August 15, 2023 6:17 PM
> To: Zeng, Oak <oak.zeng@xxxxxxxxx>; Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>; Brost, Matthew <matthew.brost@xxxxxxxxx>; Vishwanathapura, Niranjana <niranjana.vishwanathapura@xxxxxxxxx>; Welty, Brian <brian.welty@xxxxxxxxx>; Christian König <christian.koenig@xxxxxxx>; Philip Yang <Philip.Yang@xxxxxxx>; intel-xe@xxxxxxxxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx
> Subject: Re: Implement svm without BO concept in xe driver
>
> Hi Oak,
>
> I'm not sure what you're looking for from AMD? Are we just CC'ed FYI? Or are you looking for comments about
>
> * Our plans for VRAM management with HMM
> * Our experience with BO-based VRAM management
> * Something else?
>
> IMO, having separate memory pools for HMM and TTM is a non-starter for AMD. We need access to the full VRAM in either of the APIs for it to be useful. That also means we need to handle memory pressure in both directions. That's one of the main reasons we went with the BO-based approach initially. I think in the long run, using the buddy allocator, or the amdgpu_vram_mgr directly for HMM migrations would be better, assuming we can handle memory pressure in both directions between HMM and TTM sharing the same pool of physical memory.
>
> Regards,
> Felix
>
> On 2023-08-15 16:34, Zeng, Oak wrote:
> >
> > Also + Christian
> >
> > Thanks,
> >
> > Oak
> >
> > *From:* Intel-xe <intel-xe-bounces@xxxxxxxxxxxxxxxxxxxxx> *On Behalf Of* Zeng, Oak
> > *Sent:* August 14, 2023 11:38 PM
> > *To:* Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>; Brost, Matthew <matthew.brost@xxxxxxxxx>; Vishwanathapura, Niranjana <niranjana.vishwanathapura@xxxxxxxxx>; Welty, Brian <brian.welty@xxxxxxxxx>; Felix Kuehling <felix.kuehling@xxxxxxx>; Philip Yang <Philip.Yang@xxxxxxx>; intel-xe@xxxxxxxxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx
> > *Subject:* [Intel-xe] Implement svm without BO concept in xe driver
> >
> > Hi Thomas, Matt and all,
> >
> > This came up when I ported the i915 svm code to the xe driver. In the i915 implementation, we have i915_buddy manage gpu vram, and the svm code calls directly into the i915_buddy layer to allocate/free vram. There is no gem_bo/ttm bo concept involved in the svm implementation.
> >
> > In the xe driver, we have drm_buddy, xe_ttm_vram_mgr and the ttm layer to manage vram. Drm_buddy is initialized during xe_ttm_vram_mgr initialization. Vram allocation/free is done through xe_ttm_vram_mgr functions, which call into the drm_buddy layer to allocate vram blocks.
> >
> > I plan to implement the xe svm driver the same way as we did in i915, which means there will be no bo concept in the svm implementation. Drm_buddy will be passed to the svm layer during vram initialization, and svm will allocate/free memory directly from drm_buddy, bypassing the ttm/xe vram manager. Here are a few considerations/things we are aware of:
> >
> > 1. This approach seems to match the hmm design better than the bo concept does. Our svm implementation will be based on hmm. In the hmm design, each vram page is backed by a struct page. It is very easy to perform page granularity migrations (b/t vram and system memory). If the BO concept is involved, we will have to split/re-merge BOs during page granularity migrations.
> >
> > 2. We have a proof of concept of this approach in i915, originally implemented by Niranjana. It seems to work, but it only has basic functionality for now. We don't have advanced features such as memory eviction yet.
> >
> > 3. With this approach, vram will be divided into two separate pools: one for xe_gem_created BOs and one for vram used by svm. Those two pools are not connected: memory pressure from one pool won't be able to evict vram from the other pool. At this point, we don't know whether this aspect is good or not.
> >
> > 4. Amdkfd svm took a different approach, which is BO based. The benefit of that approach is that a lot of existing driver facilities (such as memory eviction/cgroup/accounting) can be reused.
> >
> > Do you have any comments on this approach? Should I come back with an RFC of some POC code?
> >
> > Thanks,
> >
> > Oak
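
P.S. To illustrate the "svm allocates directly from drm_buddy, bypassing the ttm/xe vram manager" flow described in the quoted mail above: the drm_buddy_* calls below are the existing drm-buddy API (please double-check the signatures against your tree), while the svm_* wrappers are made-up names, just a sketch:

#include <linux/list.h>
#include <drm/drm_buddy.h>

/* Hypothetical svm-side helper: grab @size bytes of vram as a list of
 * buddy blocks. A small min_page_size (e.g. PAGE_SIZE) keeps the
 * blocks fine-grained, so hmm can migrate at page granularity with no
 * BO to split or re-merge. */
static int svm_alloc_vram(struct drm_buddy *mm, u64 size,
			  struct list_head *blocks)
{
	return drm_buddy_alloc_blocks(mm, 0, mm->size, size, PAGE_SIZE,
				      blocks, 0);
}

static void svm_free_vram(struct drm_buddy *mm, struct list_head *blocks)
{
	drm_buddy_free_list(mm, blocks);
}

Note that drm_buddy requires min_page_size to be at least the chunk_size the allocator was initialized with, so PAGE_SIZE works here as long as the buddy was initialized with chunk_size <= PAGE_SIZE.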