On Wed, Nov 06, 2024 at 10:48:45AM +0100, Christian König wrote:
> Am 05.11.24 um 19:35 schrieb Matthew Brost:
> > [SNIP]
> > > Well we spend quite some time removing single page mappings from device
> > > drivers.
> > >
> > > The only remaining use case of ttm_bo_kmap() with just one page is the
> > > ttm_bo_vm_access_kmap() function and I was really hoping to make that one
> > > TTM internal at some point.
> > >
> > This is still static, right? I suppose this exposes this to the outside
> > world though in another place. I assume there is a reason we can't use
> > vmap in ttm_bo_vm_access?
> 
> Well no, the point is we don't want to.
> 
> There is a huge push from upstream to avoid using kmap/vmap if possible.
> 
> > 
> > > You need a really good justification to bring that back.
> > >
> > > > The use case is the EuDebugger, which requires essentially the same
> > > > functionality as ptrace -> vm_access.
> > > Then why don't you use ptrace in the first place?
> > >
> > I think the debugger speaks in GPU address space thus needs to access
> > via the GPU VM -> BO, userptrs.
> 
> Exactly that is strictly forbidden. You can't access userptrs through this.
> 

My mistake for mentioning userptr; I clearly caused confusion. This patch
itself has nothing to do with userptr; it is accessing a BO. In Xe, a
userptr doesn't have a BO, unlike in AMDGPU, where you have BOs for
userptrs.

The above use case was an example of modifying a GPU program with
breakpoints, speaking in GPU address space rather than CPU address space.
Hence, we cannot use ptrace. Userptr is a possible example, but that
access path in the code is different and, again, has nothing to do with
BO access in this patch.

> That's one of the major reasons why upstream has pushed back on using kmap
> so massively.

Userptr access is not part of this patch; it will be a separate code path,
so this seemingly does not apply.

> 
> Can you fully describe your use case? In other words what exactly is your
> debugger trying to do?
See above; I hope I've made this clearer. Also, I'm not really an expert
on EuDebug, as I haven't been involved in the development aside from
reviewing its interaction with the core of Xe. Any further explanation
would likely require me to loop in a colleague.

> > > > TTM mapping non-contiguous VRAM doesn't work unless I'm blind. User BOs
> > > > which the EuDebugger accesses can be non-contiguous, hence the new
> > > > helper.
> > > Then why don't you handle that inside the driver in the first place instead
> > > of going through a TTM midlayer?
> > >
> > Well common code always seems like a good idea to me. Can do this if you
> > insist though.
> > 
> > What if I change my new helper ttm_bo_access to be based on vmap for
> > SYSTEM / TT but honestly that seems wasteful too for a temporary
> > access mapping.
> 
> Well, I think we need to take a step back. The major question is what is
> your use case and is that use case valid or causes security concerns.
> 
> For example userptrs are imported anonymous pages the GPU has a DMA mapping
> for. Re-mapping them into a user address space for debugging or even
> accessing them through the ptrace interface is strictly forbidden.
> 
> We already had people trying to do exactly that and it ended not well at
> all.
> 

Again, if we can focus on what this patch is doing (accessing a BO, not a
userptr), I think that will help progress here.

To bring things together: "There is a huge push from upstream to avoid
using kmap/vmap if possible." How would you suggest accessing a BO then?
kmap/vmap are used everywhere in the DRM subsystem to access BOs, so I'm
failing to see the problem with adding a simple helper based on existing
code.

Matt

> Regards,
> Christian.
> 
> > 
> > With this, I strongly prefer the code as is.
> > 
> > Matt
> > 
> > > Regards,
> > > Christian.
> > > 
> > > > Matt
> > > > 
> > > > > Regards,
> > > > > Christian.
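To make the point about "the debugger speaks in GPU address space" concrete, here is a minimal userspace C sketch. Every name in it (fake_bo, fake_vma, gpu_va_to_bo, debug_access) is hypothetical and is not the Xe, TTM, or EuDebug API; it only illustrates that the debugger must first resolve a GPU VA to a (BO, offset) pair through the VM's mapping table and then access the BO, which is exactly the translation step ptrace cannot perform, since ptrace only understands CPU address spaces.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins, not the real driver structures. */
struct fake_bo {
	uint8_t data[4096];	/* stand-in for the BO backing store */
	size_t size;
};

struct fake_vma {
	uint64_t gpu_va_start;	/* GPU virtual address of the mapping */
	uint64_t length;
	struct fake_bo *bo;
	uint64_t bo_offset;	/* offset of the mapping within the BO */
};

/* Translate a GPU VA to a BO + offset; returns NULL if unmapped. */
static struct fake_bo *gpu_va_to_bo(struct fake_vma *vmas, int nr,
				    uint64_t gpu_va, uint64_t *offset)
{
	for (int i = 0; i < nr; i++) {
		struct fake_vma *v = &vmas[i];

		if (gpu_va >= v->gpu_va_start &&
		    gpu_va < v->gpu_va_start + v->length) {
			*offset = v->bo_offset + (gpu_va - v->gpu_va_start);
			return v->bo;
		}
	}
	return NULL;
}

/* Debugger-style access: read or write 'len' bytes at a GPU VA. */
static int debug_access(struct fake_vma *vmas, int nr, uint64_t gpu_va,
			void *buf, size_t len, int write)
{
	uint64_t offset;
	struct fake_bo *bo = gpu_va_to_bo(vmas, nr, gpu_va, &offset);

	if (!bo || offset + len > bo->size)
		return -1;
	if (write)
		memcpy(bo->data + offset, buf, len);
	else
		memcpy(buf, bo->data + offset, len);
	return (int)len;
}
```

Writing a breakpoint instruction into a shader program would go through exactly this kind of path: GPU VA in, BO access out.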
> > > > > > 
> > > > > > Matt
> > > > > > 
> > > > > > > > Reported-by: Christoph Manszewski <christoph.manszewski@xxxxxxxxx>
> > > > > > > > Suggested-by: Thomas Hellström <thomas.hellstrom@xxxxxxxxxxxxxxx>
> > > > > > > > Signed-off-by: Matthew Brost <matthew.brost@xxxxxxxxx>
> > > > > > > > Tested-by: Mika Kuoppala <mika.kuoppala@xxxxxxxxxxxxxxx>
> > > > > > > > Reviewed-by: Matthew Auld <matthew.auld@xxxxxxxxx>
> > > > > > > > ---
> > > > > > > >  drivers/gpu/drm/ttm/ttm_bo_util.c | 86 +++++++++++++++++++++++++++++++
> > > > > > > >  drivers/gpu/drm/ttm/ttm_bo_vm.c   | 65 +---------------------
> > > > > > > >  include/drm/ttm/ttm_bo.h          |  2 +
> > > > > > > >  3 files changed, 89 insertions(+), 64 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_util.c b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > > > > > > index d939925efa81..77e760ea7193 100644
> > > > > > > > --- a/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > > > > > > +++ b/drivers/gpu/drm/ttm/ttm_bo_util.c
> > > > > > > > @@ -919,3 +919,89 @@ s64 ttm_lru_walk_for_evict(struct ttm_lru_walk *walk, struct ttm_device *bdev,
> > > > > > > >  	return progress;
> > > > > > > >  }
> > > > > > > > +
> > > > > > > > +static int ttm_bo_access_kmap(struct ttm_buffer_object *bo,
> > > > > > > > +			      unsigned long offset,
> > > > > > > > +			      void *buf, int len, int write)
> > > > > > > > +{
> > > > > > > > +	unsigned long page = offset >> PAGE_SHIFT;
> > > > > > > > +	unsigned long bytes_left = len;
> > > > > > > > +	int ret;
> > > > > > > > +
> > > > > > > > +	/* Copy a page at a time, that way no extra virtual address
> > > > > > > > +	 * mapping is needed
> > > > > > > > +	 */
> > > > > > > > +	offset -= page << PAGE_SHIFT;
> > > > > > > > +	do {
> > > > > > > > +		unsigned long bytes = min(bytes_left, PAGE_SIZE - offset);
> > > > > > > > +		struct ttm_bo_kmap_obj map;
> > > > > > > > +		void *ptr;
> > > > > > > > +		bool is_iomem;
> > > > > > > > +
> > > > > > > > +		ret = ttm_bo_kmap(bo, page, 1, &map);
> > > > > > > > +		if (ret)
> > > > > > > > +			return ret;
> > > > > > > > +
> > > > > > > > +		ptr = (void *)ttm_kmap_obj_virtual(&map, &is_iomem) + offset;
> > > > > > > > +		WARN_ON_ONCE(is_iomem);
> > > > > > > > +		if (write)
> > > > > > > > +			memcpy(ptr, buf, bytes);
> > > > > > > > +		else
> > > > > > > > +			memcpy(buf, ptr, bytes);
> > > > > > > > +		ttm_bo_kunmap(&map);
> > > > > > > > +
> > > > > > > > +		page++;
> > > > > > > > +		buf += bytes;
> > > > > > > > +		bytes_left -= bytes;
> > > > > > > > +		offset = 0;
> > > > > > > > +	} while (bytes_left);
> > > > > > > > +
> > > > > > > > +	return len;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +/**
> > > > > > > > + * ttm_bo_access - Helper to access a buffer object
> > > > > > > > + *
> > > > > > > > + * @bo: ttm buffer object
> > > > > > > > + * @offset: access offset into buffer object
> > > > > > > > + * @buf: pointer to caller memory to read into or write from
> > > > > > > > + * @len: length of access
> > > > > > > > + * @write: write access
> > > > > > > > + *
> > > > > > > > + * Utility function to access a buffer object. Useful when buffer object cannot
> > > > > > > > + * be easily mapped (non-contiguous, non-visible, etc...).
> > > > > > > > + *
> > > > > > > > + * Returns:
> > > > > > > > + * @len if successful, negative error code on failure.
> > > > > > > > + */
> > > > > > > > +int ttm_bo_access(struct ttm_buffer_object *bo, unsigned long offset,
> > > > > > > > +		  void *buf, int len, int write)
> > > > > > > > +{
> > > > > > > > +	int ret;
> > > > > > > > +
> > > > > > > > +	if (len < 1 || (offset + len) > bo->base.size)
> > > > > > > > +		return -EIO;
> > > > > > > > +
> > > > > > > > +	ret = ttm_bo_reserve(bo, true, false, NULL);
> > > > > > > > +	if (ret)
> > > > > > > > +		return ret;
> > > > > > > > +
> > > > > > > > +	switch (bo->resource->mem_type) {
> > > > > > > > +	case TTM_PL_SYSTEM:
> > > > > > > > +		fallthrough;
> > > > > > > > +	case TTM_PL_TT:
> > > > > > > > +		ret = ttm_bo_access_kmap(bo, offset, buf, len, write);
> > > > > > > > +		break;
> > > > > > > > +	default:
> > > > > > > > +		if (bo->bdev->funcs->access_memory)
> > > > > > > > +			ret = bo->bdev->funcs->access_memory
> > > > > > > > +				(bo, offset, buf, len, write);
> > > > > > > > +		else
> > > > > > > > +			ret = -EIO;
> > > > > > > > +	}
> > > > > > > > +
> > > > > > > > +	ttm_bo_unreserve(bo);
> > > > > > > > +
> > > > > > > > +	return ret;
> > > > > > > > +}
> > > > > > > > +EXPORT_SYMBOL(ttm_bo_access);
> > > > > > > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > > > > > > > index 2c699ed1963a..20b1e5f78684 100644
> > > > > > > > --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > > > > > > > +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> > > > > > > > @@ -366,45 +366,6 @@ void ttm_bo_vm_close(struct vm_area_struct *vma)
> > > > > > > >  }
> > > > > > > >  EXPORT_SYMBOL(ttm_bo_vm_close);
> > > > > > > > -static int ttm_bo_vm_access_kmap(struct ttm_buffer_object *bo,
> > > > > > > > -				 unsigned long offset,
> > > > > > > > -				 uint8_t *buf, int len, int write)
> > > > > > > > -{
> > > > > > > > -	unsigned long page = offset >> PAGE_SHIFT;
> > > > > > > > -	unsigned long bytes_left = len;
> > > > > > > > -	int ret;
> > > > > > > > -
> > > > > > > > -	/* Copy a page at a time, that way no extra virtual address
> > > > > > > > -	 * mapping is needed
> > > > > > > > -	 */
> > > > > > > > -	offset -= page << PAGE_SHIFT;
> > > > > > > > -	do {
> > > > > > > > -		unsigned long bytes = min(bytes_left, PAGE_SIZE - offset);
> > > > > > > > -		struct ttm_bo_kmap_obj map;
> > > > > > > > -		void *ptr;
> > > > > > > > -		bool is_iomem;
> > > > > > > > -
> > > > > > > > -		ret = ttm_bo_kmap(bo, page, 1, &map);
> > > > > > > > -		if (ret)
> > > > > > > > -			return ret;
> > > > > > > > -
> > > > > > > > -		ptr = (uint8_t *)ttm_kmap_obj_virtual(&map, &is_iomem) + offset;
> > > > > > > > -		WARN_ON_ONCE(is_iomem);
> > > > > > > > -		if (write)
> > > > > > > > -			memcpy(ptr, buf, bytes);
> > > > > > > > -		else
> > > > > > > > -			memcpy(buf, ptr, bytes);
> > > > > > > > -		ttm_bo_kunmap(&map);
> > > > > > > > -
> > > > > > > > -		page++;
> > > > > > > > -		buf += bytes;
> > > > > > > > -		bytes_left -= bytes;
> > > > > > > > -		offset = 0;
> > > > > > > > -	} while (bytes_left);
> > > > > > > > -
> > > > > > > > -	return len;
> > > > > > > > -}
> > > > > > > > -
> > > > > > > >  int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
> > > > > > > >  		     void *buf, int len, int write)
> > > > > > > >  {
> > > > > > > > @@ -412,32 +373,8 @@ int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
> > > > > > > >  	unsigned long offset = (addr) - vma->vm_start +
> > > > > > > >  		((vma->vm_pgoff - drm_vma_node_start(&bo->base.vma_node))
> > > > > > > >  		 << PAGE_SHIFT);
> > > > > > > > -	int ret;
> > > > > > > > -
> > > > > > > > -	if (len < 1 || (offset + len) > bo->base.size)
> > > > > > > > -		return -EIO;
> > > > > > > > -	ret = ttm_bo_reserve(bo, true, false, NULL);
> > > > > > > > -	if (ret)
> > > > > > > > -		return ret;
> > > > > > > > -
> > > > > > > > -	switch (bo->resource->mem_type) {
> > > > > > > > -	case TTM_PL_SYSTEM:
> > > > > > > > -		fallthrough;
> > > > > > > > -	case TTM_PL_TT:
> > > > > > > > -		ret = ttm_bo_vm_access_kmap(bo, offset, buf, len, write);
> > > > > > > > -		break;
> > > > > > > > -	default:
> > > > > > > > -		if (bo->bdev->funcs->access_memory)
> > > > > > > > -			ret = bo->bdev->funcs->access_memory(
> > > > > > > > -				bo, offset, buf, len, write);
> > > > > > > > -		else
> > > > > > > > -			ret = -EIO;
> > > > > > > > -	}
> > > > > > > > -
> > > > > > > > -	ttm_bo_unreserve(bo);
> > > > > > > > -
> > > > > > > > -	return ret;
> > > > > > > > +	return ttm_bo_access(bo, offset, buf, len, write);
> > > > > > > >  }
> > > > > > > >  EXPORT_SYMBOL(ttm_bo_vm_access);
> > > > > > > > diff --git a/include/drm/ttm/ttm_bo.h b/include/drm/ttm/ttm_bo.h
> > > > > > > > index 5804408815be..8ea11cd8df39 100644
> > > > > > > > --- a/include/drm/ttm/ttm_bo.h
> > > > > > > > +++ b/include/drm/ttm/ttm_bo.h
> > > > > > > > @@ -421,6 +421,8 @@ void ttm_bo_unpin(struct ttm_buffer_object *bo);
> > > > > > > >  int ttm_bo_evict_first(struct ttm_device *bdev,
> > > > > > > >  		       struct ttm_resource_manager *man,
> > > > > > > >  		       struct ttm_operation_ctx *ctx);
> > > > > > > > +int ttm_bo_access(struct ttm_buffer_object *bo, unsigned long offset,
> > > > > > > > +		  void *buf, int len, int write);
> > > > > > > >  vm_fault_t ttm_bo_vm_reserve(struct ttm_buffer_object *bo,
> > > > > > > >  			     struct vm_fault *vmf);
> > > > > > > >  vm_fault_t ttm_bo_vm_fault_reserved(struct vm_fault *vmf,
> > > > > > > > -- 
> > > > > > > > 2.34.1
> > > > > > > > 
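As a side note for readers following along, the page-at-a-time scheme used by ttm_bo_access_kmap() in the quoted patch can be sketched in plain userspace C. This is a minimal sketch under stated assumptions: a flat buffer stands in for the BO backing store and pointer arithmetic stands in for the temporary ttm_bo_kmap()/ttm_bo_kunmap() mapping, so only the chunking arithmetic carries over; access_paged and FAKE_PAGE_SIZE are illustrative names, not kernel API.

```c
#include <stddef.h>
#include <stdint.h>

#define FAKE_PAGE_SIZE 4096ul

/* Split an (offset, len) access into per-page chunks so that only one
 * page would ever need a temporary mapping at a time. Returns len on
 * success, -1 on an out-of-bounds request, mirroring the helper's
 * "len or negative error code" contract. */
static int access_paged(uint8_t *backing, size_t size, unsigned long offset,
			void *buf, int len, int write)
{
	unsigned long page = offset / FAKE_PAGE_SIZE;
	unsigned long bytes_left = len;
	uint8_t *cur = buf;

	if (len < 1 || offset + len > size)
		return -1;

	/* Reduce offset to an offset within the first page. */
	offset -= page * FAKE_PAGE_SIZE;
	do {
		/* Never copy past the end of the current page. */
		unsigned long bytes = bytes_left < FAKE_PAGE_SIZE - offset ?
			bytes_left : FAKE_PAGE_SIZE - offset;
		/* In TTM this pointer would come from ttm_bo_kmap(). */
		uint8_t *ptr = backing + page * FAKE_PAGE_SIZE + offset;

		if (write)
			for (unsigned long i = 0; i < bytes; i++)
				ptr[i] = cur[i];
		else
			for (unsigned long i = 0; i < bytes; i++)
				cur[i] = ptr[i];

		page++;
		cur += bytes;
		bytes_left -= bytes;
		offset = 0;	/* later pages are copied from their start */
	} while (bytes_left);

	return len;
}
```

The key invariant is that each iteration touches exactly one page: the first chunk is capped at the distance to the next page boundary, and offset is zeroed afterwards so every later chunk starts at its page's beginning.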