Re: vm binding interfaces and parallel with mmap


 



On 22.08.22 at 10:34, Bas Nieuwenhuizen wrote:
On Mon, Aug 22, 2022 at 9:28 AM Dave Airlie <airlied@xxxxxxxxx> wrote:
On Mon, 22 Aug 2022 at 17:05, Dave Airlie <airlied@xxxxxxxxx> wrote:
Hey,

I've just been looking at the vm bind type interfaces and wanted to at
least document how we think the unmapping API should work. I know I've
talked on irc before about this, but wanted to solidify things a bit
more around what is required vs what is a nice to have.

My main concerns/thoughts are around the unbind interfaces and how
close to munmap they should be.

I think the mapping operation is mostly consistent
MAP(bo handle, offset into bo, range, VM offset, VM flags)
which puts the range inside the bo at the offset in the current VM
(maybe take an optional vm_id).

Now the simplest unmap I can see is one that parallels munmap:
UNMAP(vmaddr, range);

But that raises the question of how much the kernel needs to deal
with here. If we support arbitrary vmaddr,range then we really need to
be able to do everything munmap does for CPU VMAs, which means
splitting ranges, joining ranges etc.

like
MAP(1, 0, 0x8000, 0xc0000)
UNMAP(0xc1000, 0x1000)
should that be possible?
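
To make the cost concrete, here is a minimal sketch of what the kernel would have to do for that example if it followed munmap semantics. The struct and function names are hypothetical and only model the bookkeeping; this is not taken from any existing driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one GPU VA mapping; not a real kernel structure. */
struct gpu_mapping {
	uint32_t bo_handle;	/* buffer object backing the range */
	uint64_t bo_offset;	/* offset into the bo */
	uint64_t va;		/* start address in the VM */
	uint64_t size;		/* length in bytes */
};

/*
 * Remove [va, va + size) from mapping *m, munmap style.
 * The surviving pieces are written to out[] and their count
 * (0, 1 or 2) is returned.
 */
static int unmap_split(const struct gpu_mapping *m, uint64_t va,
		       uint64_t size, struct gpu_mapping out[2])
{
	uint64_t m_end = m->va + m->size;
	uint64_t end = va + size;
	int n = 0;

	assert(size > 0 && end > va);

	/* No overlap: the mapping survives untouched. */
	if (end <= m->va || va >= m_end) {
		out[n++] = *m;
		return n;
	}

	/* Left remainder keeps the original va and bo_offset. */
	if (va > m->va) {
		out[n] = *m;
		out[n].size = va - m->va;
		n++;
	}

	/* Right remainder starts after the hole; bo_offset moves with it. */
	if (end < m_end) {
		out[n] = *m;
		out[n].va = end;
		out[n].bo_offset = m->bo_offset + (end - m->va);
		out[n].size = m_end - end;
		n++;
	}

	return n;
}
```

For MAP(1, 0, 0x8000, 0xc0000) followed by UNMAP(0xc1000, 0x1000) this leaves two mappings: 0xc0000 with size 0x1000 at bo offset 0, and 0xc2000 with size 0x6000 at bo offset 0x2000.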

Do we have any API usage (across Vulkan/CL/CUDA/ROCm etc) that
requires this sort of control, or should we be fine with only
unmapping objects exactly like how they were mapped in the first
place, and not have any splitting/joining?
Vulkan allows for this, though I haven't checked to what extent apps use it.

This is massively used for partial resident textures under OpenGL as far as I know.

E.g. you map a range like 1->10 as PRT and then map real textures at 2, 5 and 7 or something like that.

That said, functionality to map/enable PRT for a range is necessary as well. On amdgpu we have a special flag for that, and in this case the BO to map can be NULL.

We could technically split all mapping/unmapping to be per single tile
in the userspace driver, which avoids the need for splitting/merging,
but that could very much be a pessimization.

That would be pretty much a NAK from my side. A couple of hardware optimizations require mappings to be as large as possible.

Otherwise we wouldn't be able to use huge/giant (2MiB, 1GiB) pages, power of two TLB reach optimizations (8KiB, 16KiB, 32KiB.....) as well as texture fetcher optimizations.
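
As a rough illustration of why mapping size matters, the sketch below computes the largest naturally aligned power-of-two block that fits at the start of a range, and how many such blocks a range needs. This is a simplified model with hypothetical helper names; it is not how any particular driver or hardware picks its page or fragment sizes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Largest power-of-two block size that is naturally aligned at va
 * and does not exceed size. Simplified model, hypothetical helper.
 */
static uint64_t largest_aligned_block(uint64_t va, uint64_t size)
{
	uint64_t block = 1;

	assert(size > 0);

	/* Largest power of two that is <= size. */
	while (block <= size / 2)
		block <<= 1;

	/* Shrink until va is aligned to the block size. */
	while (va & (block - 1))
		block >>= 1;

	return block;
}

/* Number of aligned power-of-two blocks needed to cover the range. */
static unsigned int count_blocks(uint64_t va, uint64_t size)
{
	unsigned int n = 0;

	while (size) {
		uint64_t b = largest_aligned_block(va, size);

		va += b;
		size -= b;
		n++;
	}

	return n;
}
```

In this model a 2MiB mapping at a 2MiB aligned address is a single block, while the same range mapped as individual 4KiB tiles is 512 separate blocks, and nothing in the kernel could tell that they belong together.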

I suppose it also asks the question around paralleling

fd = open()
ptr = mmap(fd,)
close(fd)
the mapping is still valid.

I suppose our equiv is
handle = bo_alloc()
gpu_addr = vm_bind(handle,)
gem_close(handle)
Is the gpu_addr still valid, i.e. does the VM hold a reference on the
kernel bo internally?
For Vulkan it looks like this is undefined and the above is not necessary:

"It is important to note that freeing a VkDeviceMemory object with
vkFreeMemory will not cause resources (or resource regions) bound to
the memory object to become unbound. Applications must not access
resources bound to memory that has been freed."
(32.7.6)

In addition to what was discussed here so far, we need an array of in and out drm_syncobj for both map and unmap.

E.g. when the mapping/unmapping should happen and when it is completed etc...
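
Purely as a sketch of the shape such an argument could take: every name below is hypothetical and is not an existing or proposed uAPI. It only shows one operation carrying an array of syncobj handles to wait on before it runs and an array to signal once it has completed.

```c
#include <stdint.h>

/* Hypothetical operation codes, for illustration only. */
#define HYP_VM_OP_MAP	0
#define HYP_VM_OP_UNMAP	1

/* Hypothetical bind/unbind argument; not an existing uAPI struct. */
struct hyp_vm_bind_op {
	uint32_t op;		/* HYP_VM_OP_MAP or HYP_VM_OP_UNMAP */
	uint32_t bo_handle;	/* ignored for unmap */
	uint64_t bo_offset;	/* offset into the bo, ignored for unmap */
	uint64_t va;		/* VM address */
	uint64_t range;		/* length in bytes */
	uint64_t in_syncs;	/* user pointer to u32 syncobj handles to wait on */
	uint64_t out_syncs;	/* user pointer to u32 syncobj handles to signal */
	uint32_t num_in_syncs;
	uint32_t num_out_syncs;
};

/* Fill in an unmap that waits on wait[] and signals signal[] when done. */
static struct hyp_vm_bind_op
hyp_make_unmap(uint64_t va, uint64_t range,
	       const uint32_t *wait, uint32_t num_wait,
	       const uint32_t *signal, uint32_t num_signal)
{
	struct hyp_vm_bind_op op = {
		.op = HYP_VM_OP_UNMAP,
		.va = va,
		.range = range,
		.in_syncs = (uint64_t)(uintptr_t)wait,
		.out_syncs = (uint64_t)(uintptr_t)signal,
		.num_in_syncs = num_wait,
		.num_out_syncs = num_signal,
	};

	return op;
}
```

The in array would express when the mapping/unmapping is allowed to happen, the out array when it has completed.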

Christian.



Dave.
Dave.



