Signed-off-by: Mike Rapoport <rppt@xxxxxxxxxxxxxxxxxx>
---
 Documentation/vm/hmm.txt | 66 ++++++++++++++++++++----------------------------
 1 file changed, 28 insertions(+), 38 deletions(-)

diff --git a/Documentation/vm/hmm.txt b/Documentation/vm/hmm.txt
index 4d3aac9..3fafa33 100644
--- a/Documentation/vm/hmm.txt
+++ b/Documentation/vm/hmm.txt
@@ -1,4 +1,8 @@
+.. hmm:
+
+=====================================
 Heterogeneous Memory Management (HMM)
+=====================================
 
 Transparently allow any component of a program to use any memory region of said
 program with a device without using device specific memory allocator. This is
@@ -14,19 +18,10 @@ deals with how device memory is represented inside the kernel. Finaly the last
 section present the new migration helper that allow to leverage the device DMA
 engine.
 
+.. contents:: :local:
 
-1) Problems of using device specific memory allocator:
-2) System bus, device memory characteristics
-3) Share address space and migration
-4) Address space mirroring implementation and API
-5) Represent and manage device memory from core kernel point of view
-6) Migrate to and from device memory
-7) Memory cgroup (memcg) and rss accounting
-
-
--------------------------------------------------------------------------------
-
-1) Problems of using device specific memory allocator:
+Problems of using device specific memory allocator
+==================================================
 
 Device with large amount of on board memory (several giga bytes) like GPU have
 historically manage their memory through dedicated driver specific API. This
@@ -68,9 +63,8 @@ only do-able with a share address. It is as well more reasonable to use a share
 address space for all the other patterns.
--------------------------------------------------------------------------------
-
-2) System bus, device memory characteristics
+System bus, device memory characteristics
+=========================================
 
 System bus cripple share address due to few limitations. Most system bus only
 allow basic memory access from device to main memory, even cache coherency is
@@ -100,9 +94,8 @@ access any memory memory but we must also permit any memory to be migrated to
 device memory while device is using it (blocking CPU access while it happens).
 
 
--------------------------------------------------------------------------------
-
-3) Share address space and migration
+Share address space and migration
+=================================
 
 HMM intends to provide two main features. First one is to share the address
 space by duplication the CPU page table into the device page table so same
@@ -140,14 +133,13 @@ leverage device memory by migrating part of data-set that is actively use by a
 device.
 
 
--------------------------------------------------------------------------------
-
-4) Address space mirroring implementation and API
+Address space mirroring implementation and API
+==============================================
 
 Address space mirroring main objective is to allow to duplicate range of CPU
 page table into a device page table and HMM helps keeping both synchronize. A
 device driver that want to mirror a process address space must start with the
-registration of an hmm_mirror struct:
+registration of an hmm_mirror struct::
 
  int hmm_mirror_register(struct hmm_mirror *mirror,
                          struct mm_struct *mm);
@@ -156,7 +148,7 @@ registration of an hmm_mirror struct:
 
 The locked variant is to be use when the driver is already holding the mmap_sem
 of the mm in write mode.
The mirror struct has a set of callback that are use
-to propagate CPU page table:
+to propagate CPU page table::
 
  struct hmm_mirror_ops {
      /* sync_cpu_device_pagetables() - synchronize page tables
@@ -187,7 +179,8 @@ be done with the update.
 
 
 When device driver wants to populate a range of virtual address it can use
-either:
+either::
+
  int hmm_vma_get_pfns(struct vm_area_struct *vma,
                       struct hmm_range *range,
                       unsigned long start,
@@ -211,7 +204,7 @@ that array correspond to an address in the virtual range. HMM provide a set of
 flags to help driver identify special CPU page table entries.
 
 Locking with the update() callback is the most important aspect the driver must
-respect in order to keep things properly synchronize. The usage pattern is :
+respect in order to keep things properly synchronize. The usage pattern is::
 
  int driver_populate_range(...)
  {
@@ -251,9 +244,8 @@ concurrently for multiple devices. Waiting for each device to report commands
 as executed is serialize (there is no point in doing this concurrently).
 
 
--------------------------------------------------------------------------------
-
-5) Represent and manage device memory from core kernel point of view
+Represent and manage device memory from core kernel point of view
+=================================================================
 
 Several differents design were try to support device memory. First one use
 device specific data structure to keep information about migrated memory and
@@ -269,14 +261,14 @@ un-aware of the difference. We only need to make sure that no one ever try to
 map those page from the CPU side.
 
 HMM provide a set of helpers to register and hotplug device memory as a new
-region needing struct page. This is offer through a very simple API:
+region needing struct page.
This is offer through a very simple API::
 
  struct hmm_devmem *hmm_devmem_add(const struct hmm_devmem_ops *ops,
                                    struct device *device,
                                    unsigned long size);
  void hmm_devmem_remove(struct hmm_devmem *devmem);
 
-The hmm_devmem_ops is where most of the important things are:
+The hmm_devmem_ops is where most of the important things are::
 
  struct hmm_devmem_ops {
      void (*free)(struct hmm_devmem *devmem, struct page *page);
@@ -294,13 +286,12 @@ second callback happens whenever CPU try to access a device page which it can
 not do. This second callback must trigger a migration back to system memory.
 
 
--------------------------------------------------------------------------------
-
-6) Migrate to and from device memory
+Migrate to and from device memory
+=================================
 
 Because CPU can not access device memory, migration must use device DMA engine
 to perform copy from and to device memory. For this we need a new migration
-helper:
+helper::
 
  int migrate_vma(const struct migrate_vma_ops *ops,
                  struct vm_area_struct *vma,
@@ -319,7 +310,7 @@ such migration base on range of address the device is actively accessing.
 
 The migrate_vma_ops struct define two callbacks. First one (alloc_and_copy())
 control destination memory allocation and copy operation. Second one is there
-to allow device driver to perform cleanup operation after migration.
+to allow device driver to perform cleanup operation after migration::
 
  struct migrate_vma_ops {
      void (*alloc_and_copy)(struct vm_area_struct *vma,
@@ -353,9 +344,8 @@ bandwidth but this is considered as a rare event and a price that we are
 willing to pay to keep all the code simpler.
--------------------------------------------------------------------------------
-
-7) Memory cgroup (memcg) and rss accounting
+Memory cgroup (memcg) and rss accounting
+========================================
 
 For now device memory is accounted as any regular page in rss counters (either
 anonymous if device page is use for anonymous, file if device page is use for
-- 
2.7.4