Re: [PATCH drm-next 00/14] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI

Hi Oded,

sorry for the late response, somehow this mail slipped through.

On 2/6/23 15:48, Oded Gabbay wrote:
> On Thu, Jan 19, 2023 at 7:24 AM Matthew Brost <matthew.brost@xxxxxxxxx> wrote:
>> Is this not an application issue? Millions of mappings seems a bit
>> absurd to me.
> If I look at the most extreme case for AI, assuming 256GB of HBM
> memory and page mapping of 2MB, we get to 128K of mappings. But that's
> really the extreme case imo. I assume most mappings will be much
> larger. In fact, in the most realistic scenario of large-scale
> training, a single user will probably map the entire HBM memory using
> 1GB pages.
>
> I have also a question, could this GPUVA code manage VA ranges
> mappings for userptr mappings, assuming we work without svm/uva/usm
> (pointer-is-a-pointer) ? Because then we are talking about possible
> 4KB mappings of 1 - 1.5 TB host server RAM (Implied in my question is
> the assumption this can be used also for non-VK use-cases. Please tell
> me if I'm totally wrong here).
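
For scale, the mapping counts implied by the numbers above can be checked with a few lines of C. This is only a back-of-the-envelope sketch: `mapping_count()` and the unit macros are hypothetical names, and it assumes binary units (1 GB = 2^30 bytes) and one mapping per page.

```c
#include <assert.h>
#include <stdint.h>

/* Binary units; an assumption, the sizes quoted above do not say which. */
#define KiB (UINT64_C(1) << 10)
#define MiB (UINT64_C(1) << 20)
#define GiB (UINT64_C(1) << 30)

/*
 * Hypothetical helper: number of mappings needed to cover total_bytes
 * when every mapping covers exactly page_bytes (rounded up).
 */
static uint64_t mapping_count(uint64_t total_bytes, uint64_t page_bytes)
{
	assert(page_bytes != 0);
	return (total_bytes + page_bytes - 1) / page_bytes;
}
```

With these assumptions, `mapping_count(256 * GiB, 2 * MiB)` gives 131072 (the 128K case), while `mapping_count(1024 * GiB, 4 * KiB)` gives 268435456, so the userptr case with 4KB pages is where hundreds of millions of entries could show up.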

In V2 I switched from drm_mm to a maple tree, which should handle large numbers of entries better. I also dropped the requirement for GPUVA entries to be backed by a valid GEM object.

I think it can be used for non-VK use-cases. It basically just keeps track of mappings (not allocating them in the sense of finding a hole and providing a base address for a given size). There are basic functions to insert and remove entries. For those basic functions it is ensured that colliding entries can't be inserted and only a specific given entry can be removed, rather than e.g. an arbitrary range.
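
To illustrate the semantics of the basic functions, here is a minimal userspace sketch. All names (`va_tracker`, `va_entry`, `va_insert`, `va_remove`) are hypothetical and are not the GPUVA manager's API; it uses a plain linked list instead of a maple tree and assumes that addr + range never wraps around.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only. */
struct va_entry {
	uint64_t addr;   /* start address of the mapping */
	uint64_t range;  /* size in bytes, must be non-zero */
	struct va_entry *next;
};

struct va_tracker {
	struct va_entry *head; /* unsorted list; a real tracker would use a tree */
};

/* True if [addr, addr + range) intersects the entry (no wraparound assumed). */
static bool va_overlaps(const struct va_entry *e, uint64_t addr, uint64_t range)
{
	return addr < e->addr + e->range && e->addr < addr + range;
}

/*
 * Track a caller-provided entry. The caller picks the address; nothing is
 * allocated here. Returns -1 if the entry collides with an existing one.
 */
static int va_insert(struct va_tracker *t, struct va_entry *entry)
{
	const struct va_entry *e;

	if (entry->range == 0)
		return -1;

	for (e = t->head; e; e = e->next)
		if (va_overlaps(e, entry->addr, entry->range))
			return -1;

	entry->next = t->head;
	t->head = entry;
	return 0;
}

/* Remove exactly the given entry; removing an arbitrary range is not offered. */
static int va_remove(struct va_tracker *t, struct va_entry *entry)
{
	struct va_entry **pp;

	for (pp = &t->head; *pp; pp = &(*pp)->next) {
		if (*pp == entry) {
			*pp = entry->next;
			entry->next = NULL;
			return 0;
		}
	}
	return -1; /* entry is not tracked */
}
```

In this sketch an insert that overlaps a tracked entry simply fails, and the only way to drop a mapping is to hand back the entry itself.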

There are also more advanced functions where users of the GPUVA manager can request to "force map" a new mapping and to unmap a given range. The GPUVA manager will figure out the (sub-)operations to make this happen (e.g. remove mappings in the way, split up mappings, etc.) and either provide these operations (or steps) through callbacks or through a list of operations for the caller to process.
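
A rough sketch of how such a "force map" request could be broken down into steps is below. It is not the actual implementation: the op types, `va_force_map_ops()` and `va_collect()` are hypothetical names, the existing mappings are a plain array, and address wraparound is assumed not to happen. Existing mappings fully covered by the request are unmapped, partially covered ones are split (reported as a remap with the remainders to keep), and the new mapping comes last.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operation types, for illustration only. */
enum va_op_type { VA_OP_UNMAP, VA_OP_REMAP, VA_OP_MAP };

struct va_op {
	enum va_op_type type;
	uint64_t addr, range;           /* UNMAP/REMAP: old mapping, MAP: new one */
	uint64_t prev_addr, prev_range; /* REMAP: part kept below the request (range 0 if none) */
	uint64_t next_addr, next_range; /* REMAP: part kept above the request (range 0 if none) */
};

struct va_map { uint64_t addr, range; };

typedef void (*va_op_fn)(const struct va_op *op, void *priv);

/*
 * Emit the steps needed so that [addr, addr + range) can be mapped, calling
 * fn once per step. Returns the number of steps emitted.
 */
static size_t va_force_map_ops(const struct va_map *maps, size_t n,
			       uint64_t addr, uint64_t range,
			       va_op_fn fn, void *priv)
{
	uint64_t end = addr + range;
	size_t count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t m_start = maps[i].addr;
		uint64_t m_end = m_start + maps[i].range;
		struct va_op op = { 0 };

		if (m_end <= addr || end <= m_start)
			continue; /* not in the way */

		op.addr = m_start;
		op.range = maps[i].range;
		if (m_start < addr) { /* keep the part below the request */
			op.prev_addr = m_start;
			op.prev_range = addr - m_start;
		}
		if (m_end > end) { /* keep the part above the request */
			op.next_addr = end;
			op.next_range = m_end - end;
		}
		op.type = (op.prev_range || op.next_range) ? VA_OP_REMAP : VA_OP_UNMAP;
		fn(&op, priv);
		count++;
	}

	struct va_op map_op = { .type = VA_OP_MAP, .addr = addr, .range = range };
	fn(&map_op, priv);
	return count + 1;
}

/* Callback that collects the steps into a list for later processing. */
struct va_op_list { struct va_op ops[16]; size_t n; };

static void va_collect(const struct va_op *op, void *priv)
{
	struct va_op_list *list = priv;

	if (list->n < 16)
		list->ops[list->n++] = *op;
}
```

Passing `va_collect` as the callback gives the "list of operations" flavour; a driver-specific callback that acts on each step directly would be the other one.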

Are there any other use-cases or features you could think of that would be beneficial for accelerators?

- Danilo


> Thanks,
> Oded




