[PATCH 22/24] drm/amdkfd: Adding new IOCTL for scratch memory v2

felix.kuehling@xxxxxxx (Felix Kuehling) · Mon, 21 Aug 2017 15:32:09 -0400

On 2017-08-21 01:39 PM, Jerome Glisse wrote:
> On Tue, Aug 15, 2017 at 11:00:20PM -0400, Felix Kuehling wrote:
>> From: Moses Reuben <moses.reuben at amd.com>
>>
>> v2:
>> * Renamed ALLOC_MEMORY_OF_SCRATCH to SET_SCRATCH_BACKING_VA
>> * Removed size parameter from the ioctl, it was unused
>> * Removed hole in ioctl number space
>> * No more call to write_config_static_mem
>> * Return correct error code from ioctl
> What kind of memory is supposed to back this virtual address
> range? How big is the range supposed to be? Can it be any
> valid virtual address?

Yes.

>
> My worry here is to ascertain that one cannot abuse this
> ioctl, say, to set the virtual address to some mmapped shared
> library code/data section and write something malicious
> there.

The memory is both read and written by the GPU. It's just data used by
the compute shaders. You can't really use this to do something
malicious, because the code that accesses this memory runs without
special privileges. If you point this to some other memory (shared
library code or data), it's no more dangerous than a random memcpy or
memset to that address.

It's the user mode driver's responsibility to allocate a big enough
address range and provide the virtual address to KFD.
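
A minimal sketch of that user-mode step, assuming the APU case where the
GPU shares the process address space. The struct layout and both helper
names are hypothetical and only illustrate the idea; the real argument
struct is the one defined in the patch's uapi header, and the ioctl call
itself is left out.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical argument layout, for illustration only. */
struct scratch_backing_va_args {
	uint64_t va_addr;	/* start of the range reserved by user mode */
	uint32_t gpu_id;	/* GPU the scratch range is meant for */
	uint32_t pad;		/* keeps the struct size a multiple of 8 */
};

/* Reserve `size` bytes of address space without committing memory.
 * Returns NULL on failure. */
static void *reserve_scratch_range(size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/* Fill in the arguments user mode would hand to the driver. */
static struct scratch_backing_va_args
make_scratch_args(void *base, uint32_t gpu_id)
{
	struct scratch_backing_va_args args = {
		.va_addr = (uint64_t)(uintptr_t)base,
		.gpu_id = gpu_id,
		.pad = 0,
	};

	assert(base != NULL);
	return args;
}
```

The kernel side only receives the address; sizing the range and keeping
it mapped stays with user mode, as described above.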

>
> I am assuming that if it has to go through ATS/PASID of the
> IOMMUv2
It does on APUs. On dGPUs this memory is accessed by the GPU through
GPUVM and it can be backed either by system memory or VRAM. In our
current implementation we use VRAM for performance reasons.

>  then the write protection will be asserted and we
> will see proper COW (copy on write) due to mmap PRIVATE flags.

If nothing is mapped there, the access will fail. I think you'd only get
COW in very special situations, e.g. after a fork.

> Ideally this area should be a special vma and the driver
> should track its lifetime and cancel GPU jobs if it is
> unmapped.

There is nothing special about this memory. If it's unmapped, any
subsequent accesses by the GPU will cause a segfault (on APU) or a GPUVM
fault (on dGPU).

>  But I am unsure how dynamic that scratch memory is
> supposed to be (i.e. do you allocate new scratch memory
> with every GPU job, or is it allocated once and reused for
> every job).

It's allocated once when the HSA runtime is initialized. The runtime
suballocates the scratch backing memory per queue. This relies on the
fact that the kernel won't commit actual physical pages until the first
access.
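
The lazy commit behaviour can be observed from user space. This is a
sketch under the assumption of a Linux system with a working mincore();
resident_pages() is a hypothetical helper, not part of the patch.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count the pages in [base, base + len) that currently have physical
 * memory behind them. `base` must be page aligned. Returns -1 on error. */
static long resident_pages(void *base, size_t len)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t n = (len + page - 1) / page;
	unsigned char *vec = malloc(n);
	long count = 0;

	if (!vec)
		return -1;
	if (mincore(base, len, vec) != 0) {
		free(vec);
		return -1;
	}
	for (size_t i = 0; i < n; i++)
		count += vec[i] & 1;	/* bit 0 set means resident */
	free(vec);
	return count;
}
```

Calling it on a fresh anonymous mapping, writing to one page, and
calling it again should show the count growing only for the pages that
were actually touched.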

On dGPUs it only allocates address space at initialization. When the
runtime suballocates a region, the actual VRAM for it is allocated. So
we can allocate 4GB of scratch backing virtual address space without
consuming any VRAM until it actually gets used.
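
As an illustration of per-queue suballocation from one large reserved
range, here is a simple bump allocator. It is a sketch only: the names
are hypothetical, the actual runtime's allocator may work differently,
and the base address is assumed to already satisfy the alignment.

```c
#include <stddef.h>
#include <stdint.h>

struct scratch_pool {
	uint64_t base;	/* start of the reserved VA range */
	uint64_t size;	/* total bytes reserved */
	uint64_t next;	/* offset of the first unused byte */
};

/* Carve `bytes` out of the pool for one queue. `align` must be a power
 * of two. Returns the VA of the region, or 0 if it does not fit. */
static uint64_t scratch_suballoc(struct scratch_pool *pool,
				 uint64_t bytes, uint64_t align)
{
	uint64_t start = (pool->next + align - 1) & ~(align - 1);

	if (bytes == 0 || start > pool->size || bytes > pool->size - start)
		return 0;

	pool->next = start + bytes;
	return pool->base + start;
}
```

Only address space is handed out here; backing memory would be
allocated separately for each region when it is carved out.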

Regards,
  Felix

> A bigger commit message would be nice too. I had tons
> of, I believe, valid questions.
>
> Cheers,
> Jérôme