On Thu, 2025-03-13 at 13:57 +0100, Christian König wrote:
> On 13.03.25 at 13:50, Thomas Hellström wrote:
> > Hi, Christian
> >
> > On Thu, 2025-03-13 at 11:19 +0100, Christian König wrote:
> > > On 12.03.25 at 22:03, Thomas Hellström wrote:
> > > > This RFC implements and requests comments for a way to handle
> > > > SVM with multi-device, typically with fast interconnects. It
> > > > adds generic code and helpers in drm, and device-specific code
> > > > for xe.
> > > >
> > > > For SVM, devices set up maps of device-private struct pages,
> > > > using a struct dev_pagemap. The CPU virtual address space (mm)
> > > > can then be set up using special page-table entries to point
> > > > to such pages, but they can't be accessed directly by the CPU,
> > > > but possibly by other devices using a fast interconnect. This
> > > > series aims to provide helpers to identify pagemaps that take
> > > > part in such a fast interconnect and to aid in migrating
> > > > between them.
> > > >
> > > > This is initially done by augmenting the struct dev_pagemap
> > > > with a struct drm_pagemap, and having the struct drm_pagemap
> > > > implement a "populate_mm" method, where a region of the CPU
> > > > virtual address space (mm) is populated with device_private
> > > > pages from the dev_pagemap associated with the drm_pagemap,
> > > > migrating data from system memory or other devices if
> > > > necessary. The drm_pagemap_populate_mm() function is then
> > > > typically called from a fault handler, using the struct
> > > > drm_pagemap pointer of choice. It could be referencing a local
> > > > drm_pagemap or a remote one. The migration is now completely
> > > > done by drm_pagemap callbacks (typically using a copy-engine
> > > > local to the dev_pagemap local memory).
> > >
> > > Up till here that makes sense. Maybe not necessary to be put
> > > into the DRM layer, but that is an implementation detail.
> > >
> > > > In addition there are helpers to build a drm_pagemap UAPI
> > > > using file-descriptors representing struct drm_pagemaps, and a
> > > > helper to register devices with a common fast interconnect.
> > > > The UAPI is intended to be private to the device, but if
> > > > drivers agree to identify struct drm_pagemaps by file
> > > > descriptors one could in theory do cross-driver multi-device
> > > > SVM if a use-case were found.
> > >
> > > But this completely eludes me.
> > >
> > > Why would you want a UAPI for representing pagemaps as file
> > > descriptors? Isn't it the kernel which enumerates the
> > > interconnects of the devices?
> > >
> > > I mean we somehow need to expose those interconnects between
> > > devices to userspace, e.g. like amdgpu does with its XGMI
> > > connectors. But that is static for the hardware (unless HW is
> > > hot removed/added) and so I would assume exposed through sysfs.
> >
> > Thanks for the feedback.
> >
> > The idea here is not to expose the interconnects but rather have a
> > way for user-space to identify a drm_pagemap and some level of
> > access- and lifetime control.
>
> Well, that's what I get; I just don't get why.
>
> I mean when you want to have the pagemap as an optional feature you
> can turn on and off, I would say make that a sysfs file.
>
> It's a global feature anyway and not bound in any way to the file
> descriptor, isn't it?

Getting back to this: we had some internal discussions, and the
desired behavior is for the device-private pages to have a
firstopen-lastclose lifetime (or rather a firstopen-(lastclose +
shrinker) lifetime), because of memory usage concerns.

So I believe a file descriptor is a good fit for the UAPI
representation.

Thanks,
Thomas
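
P.S. To illustrate the lifetime I have in mind, here is a rough
userspace sketch, not driver code. It assumes one reading of
"firstopen-(lastclose + shrinker)": the pages come into existence on
the first open, survive the last close, and are freed only when a
shrinker pass finds no remaining users. The struct and the
pagemap_sketch_*() names are hypothetical and are not part of the
drm_pagemap API in this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a pagemap; NOT the real struct drm_pagemap. */
struct pagemap_sketch {
	unsigned int open_count; /* file descriptors currently open */
	bool pages_present;      /* device-private struct pages exist */
};

/* First open creates the pages, or reuses ones not yet shrunk. */
static void pagemap_sketch_open(struct pagemap_sketch *p)
{
	if (p->open_count++ == 0)
		p->pages_present = true;
}

/* Last close only drops the user count; the pages stay for now. */
static void pagemap_sketch_close(struct pagemap_sketch *p)
{
	assert(p->open_count > 0);
	p->open_count--;
}

/*
 * Shrinker pass: free the pages only if nobody holds the pagemap open.
 * Returns true if the pages were freed by this call.
 */
static bool pagemap_sketch_shrink(struct pagemap_sketch *p)
{
	if (p->open_count == 0 && p->pages_present) {
		p->pages_present = false;
		return true;
	}
	return false;
}
```

With a file descriptor, open and close map naturally onto these two
hooks, which is harder to express with a global sysfs knob.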