Accidentally deleted Brian. Add back. Thanks,
Oak

> -----Original Message-----
> From: Zeng, Oak
> Sent: August 21, 2023 11:07 AM
> To: Dave Airlie <airlied@xxxxxxxxx>
> Cc: Brost, Matthew <matthew.brost@xxxxxxxxx>; Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>; Philip Yang <Philip.Yang@xxxxxxx>; Felix Kuehling <felix.kuehling@xxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx; intel-xe@xxxxxxxxxxxxxxxxxxxxx; Vishwanathapura, Niranjana <niranjana.vishwanathapura@xxxxxxxxx>; Christian König <christian.koenig@xxxxxxx>
> Subject: RE: Implement svm without BO concept in xe driver
>
> > -----Original Message-----
> > From: dri-devel <dri-devel-bounces@xxxxxxxxxxxxxxxxxxxxx> On Behalf Of Dave Airlie
> > Sent: August 20, 2023 6:21 PM
> > To: Zeng, Oak <oak.zeng@xxxxxxxxx>
> > Cc: Brost, Matthew <matthew.brost@xxxxxxxxx>; Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>; Philip Yang <Philip.Yang@xxxxxxx>; Felix Kuehling <felix.kuehling@xxxxxxx>; Welty, Brian <brian.welty@xxxxxxxxx>; dri-devel@xxxxxxxxxxxxxxxxxxxxx; intel-xe@xxxxxxxxxxxxxxxxxxxxx; Vishwanathapura, Niranjana <niranjana.vishwanathapura@xxxxxxxxx>; Christian König <christian.koenig@xxxxxxx>
> > Subject: Re: Implement svm without BO concept in xe driver
> >
> > On Thu, 17 Aug 2023 at 12:13, Zeng, Oak <oak.zeng@xxxxxxxxx> wrote:
> > >
> > > > -----Original Message-----
> > > > From: Dave Airlie <airlied@xxxxxxxxx>
> > > > Sent: August 16, 2023 6:52 PM
> > > > To: Felix Kuehling <felix.kuehling@xxxxxxx>
> > > > Cc: Zeng, Oak <oak.zeng@xxxxxxxxx>; Christian König <christian.koenig@xxxxxxx>; Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>; Brost, Matthew <matthew.brost@xxxxxxxxx>; maarten.lankhorst@xxxxxxxxxxxxxxx; Vishwanathapura, Niranjana <niranjana.vishwanathapura@xxxxxxxxx>; Welty, Brian <brian.welty@xxxxxxxxx>; Philip Yang <Philip.Yang@xxxxxxx>; intel-xe@xxxxxxxxxxxxxxxxxxxxx; dri-devel@xxxxxxxxxxxxxxxxxxxxx
> > > > Subject: Re: Implement svm without BO concept in xe driver
> > > >
> > > > On Thu, 17 Aug 2023 at 08:15, Felix Kuehling <felix.kuehling@xxxxxxx> wrote:
> > > > >
> > > > > On 2023-08-16 13:30, Zeng, Oak wrote:
> > > > > > I spoke with Thomas. We discussed two approaches:
> > > > > >
> > > > > > 1) Make ttm_resource a central place for vram management functions such as eviction and cgroup memory accounting. Both the BO-based driver and the BO-less SVM code call into the ttm_resource_alloc/free functions for vram allocation/free.
> > > > > >     * This way the BO driver and the SVM driver share the eviction/cgroup logic; no need to reimplement an LRU eviction list in the SVM driver. Cgroup logic should be in the ttm_resource layer. +Maarten.
> > > > > >     * ttm_resource is not a perfect match for SVM to allocate vram. It is still a big overhead. The *bo* member of ttm_resource is not needed for SVM - this might end up with invasive changes to ttm... need to look into more details.
> > > > >
> > > > > Overhead is a problem. We'd want to be able to allocate, free and evict memory at a similar granularity as our preferred migration and page fault granularity, which defaults to 2MB in our SVM implementation.
> > > > >
> > > > > > 2) SVM code allocates memory directly from the drm-buddy allocator, and memory eviction functions are exposed from both ttm and svm so they can evict memory from each other.
> > > > > > For example, expose the ttm_mem_evict_first function from the ttm side so hmm/svm code can call it; expose a similar function from the svm side so ttm can evict hmm memory.
> > > > >
> > > > > I like this option. One thing that needs some thought with this is how to get some semblance of fairness between the two types of clients. Basically how to choose what to evict. And what share of the available memory does each side get to use on average. E.g. an idle client may get all its memory evicted while a busy client may get a bigger share of the available memory.
> > > >
> > > > I'd also like to suggest we try to write any management/generic code in a driver-agnostic way as much as possible here. I don't really see much hw difference that should be influencing it.
> > > >
> > > > I do worry about having effectively 2 LRUs here, you can't really have two "leasts".
> > > >
> > > > Like if we hit the shrinker paths, who goes first? Do we shrink one object from each side in turn?
> > >
> > > One way to solve this fairness problem is to create a driver-agnostic drm_vram_mgr. Maintain a single LRU in drm_vram_mgr. Move the memory eviction/cgroups memory accounting logic from the ttm_resource manager to drm_vram_mgr. Both the BO-based driver and the SVM driver call into drm_vram_mgr to allocate/free memory.
> > >
> > > I am not sure whether this meets the 2M allocate/free/evict granularity requirement Felix mentioned above. SVM can allocate 2M-sized blocks, but the BO driver should be able to allocate blocks of any arbitrary size - so the eviction is also of arbitrary size.
> > >
> > > > Also, will we have systems where we can expose system SVM but userspace may choose to not use the fine-grained SVM and use one of the older modes? Will that path get emulated on top of SVM or use the BO paths?
> > >
> > > If by "older modes" you mean gem_bo_create (such as xe_gem_create or amdgpu_gem_create), then today both amd and intel implement those interfaces using the BO path. We don't have a plan to emulate that old mode on top of SVM, afaict.
> >
> > I'm not sure how the older modes manifest in the kernel - I assume as bo creates (but they may use userptr). SVM isn't a specific thing, it's a group of 3 things:
> >
> > 1) coarse-grained SVM, which I think is BO
> > 2) fine-grained SVM, which is page level
> > 3) fine-grained system SVM, which is HMM
> >
> > I suppose I'm asking about the previous versions and how they would operate in a system SVM capable system.
>
> I got your question now.
>
> As I understand it, system SVM provides similar functionality to BO-based SVM (i.e., a shared virtual address space between the cpu and gpu program, no explicit memory placement for the gpu program), but they have different user interfaces (malloc/mmap vs bo create/vm bind).
>
> From a functionality perspective, on a system SVM capable system we don't need #1/#2. Once #3 is implemented and turns out to be as performant as #1/#2, we can ask user space to switch to #3.
>
> As far as I know, AMD doesn't have #1/#2 - their BO-based driver *requires* that all valid GPU virtual addresses be mapped in the GPU page table *before* GPU kernel submission, i.e. a GPU page fault is treated as fatal. Felix, please correct me, as my AMD knowledge is fading away...
>
> From an interface perspective, i.e., to keep UMDs which use #1/#2 running without modification, we need #1/#2 to continue to exist.
>
> Should we emulate #1/#2 on top of #3? I feel the BO-based memory management and the struct page/hmm based memory management are quite different design philosophies. Trying to emulate one on top of the other can run into serious difficulty. For example, how do we emulate a vm_bind on top of #3? Remember that for #1/#2 the virtual address space is managed by user space, while for #3 the virtual address space is managed by kernel core mm (vma struct...). It is a hard conflict here...
>
> Thanks again for the great question!
> Oak
>
> >
> > Dave.
> >
> > > Thanks,
> > > Oak
> > >
> > > >
> > > > Dave.
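
As a very rough illustration of the single-LRU idea discussed in the thread: below is a minimal, purely hypothetical sketch. The names drm_vram_mgr, drm_vram_node and the evict callback are made up for this example and are not existing kernel API, and refcounting/locking subtleties are ignored. The point is only that both the BO path and the SVM path would hang their vram allocations on one shared LRU, so "who gets evicted first" is answered by a single least-recently-used ordering instead of two competing lists.

/*
 * Hypothetical sketch only - none of these symbols exist in the kernel.
 * Both the BO-based driver and the SVM code wrap their vram allocations
 * in a drm_vram_node, so eviction order comes from one shared LRU.
 */
#include <linux/errno.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/types.h>

struct drm_vram_node;

struct drm_vram_mgr {
	spinlock_t lru_lock;
	struct list_head lru;	/* least recently used entries at the head */
	u64 used_bytes;		/* could also feed cgroup memory accounting */
};

struct drm_vram_node {
	struct list_head lru_link;	/* INIT_LIST_HEAD() this at allocation */
	u64 size;
	/* The BO path and the SVM path each supply their own eviction routine. */
	int (*evict)(struct drm_vram_node *node);
};

/* Called on allocation and on access so the node counts as recently used. */
static void drm_vram_mgr_touch(struct drm_vram_mgr *mgr,
			       struct drm_vram_node *node)
{
	spin_lock(&mgr->lru_lock);
	list_move_tail(&node->lru_link, &mgr->lru);
	spin_unlock(&mgr->lru_lock);
}

/* Evict the least recently used node, whichever client it belongs to. */
static int drm_vram_mgr_evict_first(struct drm_vram_mgr *mgr)
{
	struct drm_vram_node *victim;

	spin_lock(&mgr->lru_lock);
	victim = list_first_entry_or_null(&mgr->lru, struct drm_vram_node,
					  lru_link);
	if (victim)
		list_del_init(&victim->lru_link);
	spin_unlock(&mgr->lru_lock);

	if (!victim)
		return -ENOSPC;

	/* For a BO this would be a ttm-style eviction; for SVM, a migration to sysmem. */
	return victim->evict(victim);
}

Whether plain LRU is enough for the fairness Felix asks about, and whether nodes can stay 2MB-granular for SVM while remaining arbitrary-size for BOs, are exactly the open questions in the thread; the sketch only shows where a single shared list would sit.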