On Fri, Aug 30, 2024 at 11:16:53AM +0200, Thomas Hellström wrote:
> Hi, Matthew
> 
> On Tue, 2024-08-27 at 19:48 -0700, Matthew Brost wrote:
> > +/**
> > + * DOC: Overview
> > + *
> > + * GPU Shared Virtual Memory (GPU SVM) layer for the Direct Rendering Manager (DRM)
> > + *
> > + * The GPU SVM layer is a component of the DRM framework designed to manage shared
> > + * virtual memory between the CPU and GPU. It enables efficient data exchange and
> > + * processing for GPU-accelerated applications by allowing memory sharing and
> > + * synchronization between the CPU's and GPU's virtual address spaces.
> > + *
> > + * Key GPU SVM Components:
> > + * - Notifiers: Used for tracking memory intervals and notifying the
> > + *              GPU of changes, notifiers are sized based on a GPU SVM
> > + *              initialization parameter, with a recommendation of 512M or
> > + *              larger. They maintain a Red-Black tree and a list of ranges that
> > + *              fall within the notifier interval. Notifiers are tracked within
> > + *              a GPU SVM Red-Black tree and list and are dynamically inserted
> > + *              or removed as ranges within the interval are created or
> > + *              destroyed.
> > + * - Ranges: Represent memory ranges mapped in a DRM device and managed
> > + *           by GPU SVM. They are sized based on an array of chunk sizes, which
> > + *           is a GPU SVM initialization parameter, and the CPU address space.
> > + *           Upon GPU fault, the largest aligned chunk that fits within the
> > + *           faulting CPU address space is chosen for the range size. Ranges are
> > + *           expected to be dynamically allocated on GPU fault and removed on an
> > + *           MMU notifier UNMAP event. As mentioned above, ranges are tracked in
> > + *           a notifier's Red-Black tree.
> > + * - Operations: Define the interface for driver-specific SVM operations such as
> > + *               allocation, page collection, migration, invalidations, and VRAM
> > + *               release.
> > + *
> 
> Another question: since ranges, as I understand it, are per gpuvm and
> per cpu mm, whereas migration is per device and per cpu_mm (we might
> have multiple gpuvms mapping the same cpu_mm), I figure the gpu_svm is
> per gpuvm, but that makes migration currently inconsistent, right?

I think anything that tracks va must be 1:1 tied to the single specific
cpu mm that we use for hmm/svm. So I think that's ok.

There's a pile of paths where that 1:1 mapping doesn't capture the
entire picture, but I think there the right choice is to just completely
ignore any cpu/gpu mm/vma stuff and de facto rely on the core mm rmap
datastructure to make sure we find them all (e.g. to update/invalidate
ptes during migration).
-Sima
-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
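
For reference, a minimal user-space sketch of the "largest aligned chunk
that fits within the faulting CPU address space" sizing described in the
quoted DOC comment. The names here (svm_pick_chunk, chunk_sizes) are
hypothetical and the logic is simplified for illustration; this is not the
drm_gpusvm implementation from the patch, which also has to consider
migration and the locked VMA state.

/*
 * Hypothetical sketch: pick the largest chunk whose fault-aligned
 * interval still fits inside the faulting CPU VMA. Build with any
 * C compiler; not kernel code.
 */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Example chunk-size array, largest first (2M, 64K, 4K). */
static const uint64_t chunk_sizes[] = { 0x200000, 0x10000, 0x1000 };

/*
 * Return the largest chunk size such that the fault-address-aligned
 * interval [*out_start, *out_start + chunk) lies within the CPU VMA
 * [vma_start, vma_end), or 0 if no chunk fits.
 */
static uint64_t svm_pick_chunk(uint64_t fault_addr,
			       uint64_t vma_start, uint64_t vma_end,
			       uint64_t *out_start)
{
	for (size_t i = 0; i < sizeof(chunk_sizes) / sizeof(chunk_sizes[0]); i++) {
		uint64_t chunk = chunk_sizes[i];
		uint64_t start = fault_addr & ~(chunk - 1); /* align down */

		if (start >= vma_start && start + chunk <= vma_end) {
			*out_start = start;
			return chunk;
		}
	}
	return 0;
}

int main(void)
{
	uint64_t start, size;

	/* Fault in a small (32K) VMA: only the 4K chunk fits. */
	size = svm_pick_chunk(0x7f0000001000, 0x7f0000000000,
			      0x7f0000008000, &start);
	printf("range [%#lx, %#lx)\n", (unsigned long)start,
	       (unsigned long)(start + size));

	/* Fault in a large (16M) VMA: the full 2M chunk fits. */
	size = svm_pick_chunk(0x7f0000401000, 0x7f0000000000,
			      0x7f0001000000, &start);
	printf("range [%#lx, %#lx)\n", (unsigned long)start,
	       (unsigned long)(start + size));
	return 0;
}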