Re: [PATCH] drm/radeon: add user pointer support v2

Christian König <deathsimple@xxxxxxxxxxx> · Wed, 02 Jul 2014 18:44:30 +0200




Yes this could be argued that way, that anyway user than can use the gpu
can already pin large amount of ram. But pining anonymous memory or file
backed memory (ie non regular bo memory) is different as pages are still
on the lru list and thus the vmscan code will still go over them which
might worsen the out of memory case.
Assuming we forbid file backed pages and only allow anonymous memory for 
this feature, wouldn't it be possible to "steal" the pages in question 
from the lru and add them as backing pages to an BO?

For userspace it should look like they just created a BO, copied the 
content of the memory location in question into it and then mapped the 
BO at the original location (obviously without the race condition that 
such a thing would imply for doing it in userspace).

Just a thought,
Christian.

Am 01.07.2014 21:05, schrieb Jerome Glisse:
On Tue, Jul 01, 2014 at 08:14:47PM +0200, Christian König wrote:
Am 01.07.2014 17:35, schrieb Jerome Glisse:
On Tue, Jul 01, 2014 at 11:49:18AM +0200, Christian König wrote:
Am 30.06.2014 19:35, schrieb Jerome Glisse:
On Mon, Jun 30, 2014 at 03:04:16PM +0200, Christian König wrote:

[SNIP]
@@ -176,8 +181,15 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
  	if (p->cs_flags & RADEON_CS_USE_VM)
  		p->vm_bos = radeon_vm_get_bos(p->rdev, p->ib.vm,
  					      &p->validated);
+	if (need_mmap_lock)
+		down_read(&current->mm->mmap_sem);
+
+	r = radeon_bo_list_validate(p->rdev, &p->ticket, &p->validated, p->ring);
-	return radeon_bo_list_validate(p->rdev, &p->ticket, &p->validated, p->ring);
+	if (need_mmap_lock)
+		up_read(&current->mm->mmap_sem);
+
+	return r;
  }
So the mmap lock protect almost nothing here. Once things are validated
nothing forbid the userspace to unmap the range containing the user bo.
As far as I understand it (and I'm probably not so deep into the MM
code as you, so correct me if I'm wrong) unmapping the range would
result in an invalidate_range_start callback and this callback in
turn validates the BO into the CPU domain again. So it actually
blocks for the GPU operation to be completed, marks the pages in
question as dirty and then unpins them.

This should protected us anything nasty to happen in the case of a
unmap(), fork() etc...
Thing is nothing forbid from a new cs ioctl to happen right after
the radeon range_start callback but before the core mm code that
was about to do something. The mmap_sem will protect you from fork
or munmap but not from otherthing.
Any suggestion on how to fix this?

I mean I could take the BO reservation in range_start and release it
in range_end, but I'm pretty sure the code calling the MMU notifier
won't like that.
I need to think a bit more about all the case other than munmap/fork
to see what might happen and under what circumstances.

[SNIP]
+
+static const struct mmu_notifier_ops radeon_bo_notifier = {
+	.release = radeon_bo_mn_release,
+	.invalidate_range_start = radeon_bo_mn_invalidate_range_start,
+};
What about invalidate_page callback ? You are breaking write back to
disk. Which is not a welcome move.
Why? Keep in mind that the pages are pinned down as soon as we make
the first CS which uses them and marked as dirty when we are done
with them. In between those two events the page shouldn't be written
back to disk.
Page could be dirty prior to be pin, or dirtied after being (CPU access).
I am just pointing out that the GPU might be writting to the page while
the page is use as source for disk I/O.
I still don't see the problem here, the worst thing that could
happen is that we write a halve filled page to disk. But since we
set the dirty bit again after the GPU is done with the pages it
should be written back to disk again if necessary.
This break the fsync semantic and expectation. This is the reason why the
cpu mapping is set to read only while disk i/o is in progress.

[SNIP]
The biggest issue with it is the pining of memory, this violate the memcg.
Direct-io pin memory but limit itself to a reasonable amount at a time.
A GPU buffer object might be several hundred mega byte which obviously
can be nasty from memory starvation point of view.
I thought we would handle this gracefully by accounting the memory
pinned down to the GTT size as well. E.g. allocating GTT buffers
pins down memory in the same way we would pin it down through this
interface.

In the end the maximum amount of pinned down memory is always the GTT size.
Yes this could be argued that way, that anyway user than can use the gpu
can already pin large amount of ram. But pining anonymous memory or file
backed memory (ie non regular bo memory) is different as pages are still
on the lru list and thus the vmscan code will still go over them which
might worsen the out of memory case.

Regards,
Christian.

Most other issue can be worked out.

Cheers,
Jérôme


_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel