Re: VM on GPUs

Alex Deucher <alexdeucher@xxxxxxxxx> · Fri, 20 Feb 2015 17:19:35 -0500

On Fri, Feb 20, 2015 at 12:35 PM, Jan Vesely <jan.vesely@xxxxxxxxxxx> wrote:
> Hello radeon devs,
>
> I have been trying to find out more about VM implementation on SI+ hw,
> but unfortunately I could not find much in the public documents[0].
>
> SI ISA manual suggests that there is a limited form of privileged mode
> on these chips, so I wondered if it could be used for VM management too
> (the docs only deal with numerical exceptions). Or does it always have
> to be handled by host (driver)?

Those are related to trap/exception privilege, e.g., for debugging.
I'm not that familiar with how that stuff works, but it's unrelated to
GPUVM.

>
> One of the older patches [1] mentions different page sizes, is there any
> public documentation on things like page table format, and GPU MMU
> hierarchy? I could only get limited picture going through the code and
> comments.

There is not any public documentation on the VM hardware other than
what is available in the driver.  I can try and give you an overview
of how it works.  There are 16 VM contexts (8 on cayman/TN/RL) on the
GPU that can be active at any given time.  GPUVM supports a 40 bit
address space.  Each context has an id, we call them vmids.  vmid 0 is
a bit special.  It's called the system context and behaves a bit
differently to the other ones.  It's designed to be for the kernel
driver's view of GPU accessible memory.  I can go into further detail
if you want, but I don't think it's critical for this discussion.
Just think of it as the context used by the kernel driver.  So that
leaves 15 contexts (7 on cayman and TN/RL) available for use by user
clients.  vmid 0 has its own set of configuration registers, and vmids
1-15 share a second set (other than the page tables).  E.g., contexts
1-15 all have to use either single or 2 level page tables.  You select
which VM context is used for a particular command
buffer by a field in the command buffer packet.  Some engines (e.g.,
UVD or the display hardware) do not support VM so they always use vmid
0.  Right now only the graphics, compute, and DMA engines support VM.

With single level page tables, you just have a big array of page table
entries (PTEs) that represent the entire virtual address space.  With
multi-level page tables, the address space is represented by an array
of page directory entries (PDEs) that point to page table blocks
(PTBs) which are arrays of PTEs.

PTEs and PDEs are 64 bits per entry.
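
To put numbers on the single level case, here is a rough sketch of how
large a flat table would be.  It assumes 4k base pages and 8 byte
entries; the function name is made up for illustration and is not from
the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a single level page table covering va_bits of
 * address space, assuming 4k pages (12 bits) and 8 byte PTEs.
 * Hypothetical helper, not a driver function. */
static uint64_t vm_flat_table_bytes(unsigned va_bits)
{
    assert(va_bits >= 12 && va_bits <= 60);
    return (1ULL << (va_bits - 12)) * 8;
}
```

For the full 40 bit space this works out to 2 GiB of PTEs, which is
one reason to prefer 2 level tables or a smaller per-client address
space.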

PDE:
39:12 - PTB address
0 - PDE valid (the entry is valid)

PTE:
39:12 - page address
11:7 - fragment
6 - write
5 - read
2 - CPU cache snoop (for accessing cached system memory)
1 - system (page is in system memory rather than vram)
0 - PTE valid (the entry is valid)
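
A minimal sketch of packing these fields in C follows.  The bit
positions come from the layout above; the macro and function names are
hypothetical and do not match the driver's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions from the PDE/PTE layout above; names are made up. */
#define VM_ENTRY_VALID     (1ULL << 0)
#define VM_PTE_SYSTEM      (1ULL << 1)
#define VM_PTE_SNOOP       (1ULL << 2)
#define VM_PTE_READ        (1ULL << 5)
#define VM_PTE_WRITE       (1ULL << 6)
#define VM_PTE_FRAG_SHIFT  7
#define VM_PTE_FRAG_MASK   (0x1FULL << VM_PTE_FRAG_SHIFT)  /* bits 11:7 */
#define VM_ADDR_MASK       0x000000FFFFFFF000ULL           /* bits 39:12 */

/* Build a valid PTE from a 4k aligned page address, a fragment value
 * and a set of VM_PTE_* flags. */
static uint64_t vm_make_pte(uint64_t page_addr, unsigned fragment,
                            uint64_t flags)
{
    assert((page_addr & ~VM_ADDR_MASK) == 0);
    assert(fragment < 32);  /* the field is 5 bits wide */
    return page_addr | ((uint64_t)fragment << VM_PTE_FRAG_SHIFT) |
           flags | VM_ENTRY_VALID;
}

/* Build a valid PDE pointing at a 4k aligned page table block. */
static uint64_t vm_make_pde(uint64_t ptb_addr)
{
    assert((ptb_addr & ~VM_ADDR_MASK) == 0);
    return ptb_addr | VM_ENTRY_VALID;
}

/* Extract the fragment field from a PTE. */
static unsigned vm_pte_fragment(uint64_t pte)
{
    return (unsigned)((pte & VM_PTE_FRAG_MASK) >> VM_PTE_FRAG_SHIFT);
}
```

E.g., vm_make_pte(0x1000, 0, VM_PTE_READ | VM_PTE_WRITE) yields 0x1061.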

Fragment needs some explanation. The logical/physical fragment size in
bytes = 2 ^ (12 + fragment).  A fragment value of 0 means 4k, 1 means
8k, etc.  The logical address must be aligned to the fragment size and
the memory backing it must be contiguous and at least as large as the
fragment size.  Larger fragment sizes reduce the pressure on the TLB
since fewer entries are required for the same amount of memory.
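
As an illustration of those two rules, the sketch below picks the
largest fragment value for a mapping.  It checks only what is stated
above (logical alignment and contiguous backing length); the function
names and the hw_max cap are assumptions for the example, not driver
code.

```c
#include <assert.h>
#include <stdint.h>

/* Fragment size in bytes: 2 ^ (12 + fragment). */
static uint64_t vm_fragment_bytes(unsigned fragment)
{
    assert(fragment < 32);  /* the PTE field is 5 bits wide */
    return 1ULL << (12 + fragment);
}

/* Largest fragment value such that gpu_va is aligned to the fragment
 * size and contig_len (bytes of contiguous backing memory starting at
 * the mapped page) is at least the fragment size.  hw_max is a
 * hypothetical upper limit chosen by the caller. */
static unsigned vm_max_fragment(uint64_t gpu_va, uint64_t contig_len,
                                unsigned hw_max)
{
    unsigned frag = 0;

    assert(hw_max < 32);
    while (frag < hw_max) {
        uint64_t next = vm_fragment_bytes(frag + 1);

        if (gpu_va & (next - 1))   /* not aligned to the next size */
            break;
        if (contig_len < next)     /* backing memory too small */
            break;
        frag++;
    }
    return frag;
}
```

E.g., a 2 MB aligned address backed by 2 MB of contiguous memory gets
fragment 9, since 2 ^ (12 + 9) = 2 MB.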

For system pages, the page address is the dma address of the page.
The system bit must be set and the snoop bit can be optionally set
depending on whether you are using cacheable memory.

For vram pages, the address is the GPU physical address of vram
(starts at 0 on dGPUs, starts at MC_VM_FB_OFFSET (dma address of
"vram" carve out) on APUs).
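
Roughly, the address and location bits for the two cases could be
formed like this.  The names are hypothetical, read/write and fragment
bits are left out, and fb_base stands for the start of vram in GPU
physical address space (0 on dGPUs; on APUs it would come from
MC_VM_FB_OFFSET, whose register encoding is not covered here).

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions from the PTE layout above; names are made up. */
#define PTE_VALID   (1ULL << 0)
#define PTE_SYSTEM  (1ULL << 1)
#define PTE_SNOOP   (1ULL << 2)

/* System page: address is the dma address, system bit set, snoop bit
 * set only for CPU cached memory. */
static uint64_t vm_system_page_bits(uint64_t dma_addr, bool cpu_cached)
{
    uint64_t bits = dma_addr | PTE_SYSTEM | PTE_VALID;

    if (cpu_cached)
        bits |= PTE_SNOOP;
    return bits;
}

/* Vram page: address is the GPU physical address of the page, with
 * the system and snoop bits left clear. */
static uint64_t vm_vram_page_bits(uint64_t fb_base, uint64_t vram_offset)
{
    return (fb_base + vram_offset) | PTE_VALID;
}
```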

You can also adjust the page table block size, which controls the
number of pages per PTB and therefore how many PDEs you need to cover
the address space.  E.g., if you set the block size to 0, each PTB is
4k and holds 512 PTEs; if you set it to 1, each PTB is 8k and holds
1024 PTEs, etc.
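
The arithmetic can be sketched as below.  It assumes 4k pages, 8 byte
entries, and that the PDE index is simply the upper bits of the
virtual address; that split is my assumption for illustration, and the
helper names are not from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* PTEs per page table block: 512 for block size 0, doubling with
 * each increment. */
static uint64_t vm_ptes_per_ptb(unsigned block_size)
{
    return 1ULL << (9 + block_size);
}

/* Size of one PTB in bytes (8 bytes per PTE). */
static uint64_t vm_ptb_bytes(unsigned block_size)
{
    return vm_ptes_per_ptb(block_size) * 8;
}

/* Number of PDEs needed to cover va_bits of address space. */
static uint64_t vm_num_pdes(unsigned va_bits, unsigned block_size)
{
    assert(va_bits >= 21 + block_size);
    return 1ULL << (va_bits - 21 - block_size);
}

/* Index of the PDE covering a virtual address (assumed split). */
static uint64_t vm_pde_index(uint64_t va, unsigned block_size)
{
    return va >> (21 + block_size);
}

/* Index of the PTE within its PTB (assumed split). */
static uint64_t vm_pte_index(uint64_t va, unsigned block_size)
{
    return (va >> 12) & (vm_ptes_per_ptb(block_size) - 1);
}
```

With block size 0 and the full 40 bit space, each PTB covers 2 MB and
the page directory needs 2 ^ 19 = 524288 PDEs.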

GPUVM is only concerned with memory management and protection.  There
are other protection features in other hw blocks for things beyond
memory.  For example, on CI and newer asics, the CP and SDMA blocks
execute command buffers in a secure mode that limits them to accessing
only registers that are relevant for those blocks (e.g., gfx or
compute state registers, but not display registers) or only executing
certain packets.

I hope this helps.  Let me know if you have any more questions.

Alex

>
>
> thank you,
> Jan
>
> [0]http://developer.amd.com/resources/documentation-articles/developer-guides-manuals/
> [1]http://lists.freedesktop.org/archives/dri-devel/2014-May/058858.html
>
>
> --
> Jan Vesely <jan.vesely@xxxxxxxxxxx>