On Tue, Jul 01, 2014 at 08:14:47PM +0200, Christian König wrote:
> On 01.07.2014 17:35, Jerome Glisse wrote:
> >On Tue, Jul 01, 2014 at 11:49:18AM +0200, Christian König wrote:
> >>On 30.06.2014 19:35, Jerome Glisse wrote:
> >>>On Mon, Jun 30, 2014 at 03:04:16PM +0200, Christian König wrote:
> >>>
> >>>[SNIP]
> >>>@@ -176,8 +181,15 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
> >>>  	if (p->cs_flags & RADEON_CS_USE_VM)
> >>>  		p->vm_bos = radeon_vm_get_bos(p->rdev, p->ib.vm,
> >>>  					      &p->validated);
> >>>+	if (need_mmap_lock)
> >>>+		down_read(&current->mm->mmap_sem);
> >>>+
> >>>+	r = radeon_bo_list_validate(p->rdev, &p->ticket, &p->validated, p->ring);
> >>>-	return radeon_bo_list_validate(p->rdev, &p->ticket, &p->validated, p->ring);
> >>>+	if (need_mmap_lock)
> >>>+		up_read(&current->mm->mmap_sem);
> >>>+
> >>>+	return r;
> >>>  }
> >>>
> >>>So the mmap lock protects almost nothing here. Once things are validated,
> >>>nothing forbids userspace from unmapping the range containing the user BO.
> >>
> >>As far as I understand it (and I'm probably not so deep into the MM
> >>code as you, so correct me if I'm wrong) unmapping the range would
> >>result in an invalidate_range_start callback and this callback in
> >>turn validates the BO into the CPU domain again. So it actually
> >>blocks for the GPU operation to be completed, marks the pages in
> >>question as dirty and then unpins them.
> >>
> >>This should protect us from anything nasty happening in the case of
> >>an unmap(), fork() etc...
> >
> >The thing is, nothing forbids a new CS ioctl from happening right after
> >the radeon range_start callback but before the core mm code does
> >whatever it was about to do. The mmap_sem will protect you from fork
> >or munmap, but not from other things.
> 
> Any suggestion on how to fix this?
> 
> I mean I could take the BO reservation in range_start and release it
> in range_end, but I'm pretty sure the code calling the MMU notifier
> won't like that.

I need to think a bit more about all the cases other than munmap/fork to
see what might happen and under what circumstances.

> 
> [SNIP]
> >>>>+
> >>>>+static const struct mmu_notifier_ops radeon_bo_notifier = {
> >>>>+	.release = radeon_bo_mn_release,
> >>>>+	.invalidate_range_start = radeon_bo_mn_invalidate_range_start,
> >>>>+};
> >>>
> >>>What about the invalidate_page callback? You are breaking writeback to
> >>>disk, which is not a welcome move.
> >>
> >>Why? Keep in mind that the pages are pinned down as soon as we make
> >>the first CS which uses them and marked as dirty when we are done
> >>with them. In between those two events the page shouldn't be written
> >>back to disk.
> >
> >The page could be dirty prior to being pinned, or dirtied after being
> >pinned (CPU access). I am just pointing out that the GPU might be
> >writing to the page while the page is used as the source for disk I/O.
> 
> I still don't see the problem here, the worst thing that could
> happen is that we write a half-filled page to disk. But since we
> set the dirty bit again after the GPU is done with the pages it
> should be written back to disk again if necessary.

This breaks the fsync semantics and expectations. This is the reason why
the CPU mapping is set to read-only while disk I/O is in progress.

> 
> [SNIP]
> >
> >The biggest issue with it is the pinning of memory; this violates the
> >memcg. Direct-io pins memory but limits itself to a reasonable amount
> >at a time. A GPU buffer object might be several hundred megabytes,
> >which obviously can be nasty from a memory starvation point of view.
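
For comparison, this is roughly the bounded pinning pattern that
direct-io style code follows: pin a small batch, do the I/O on it,
release it, then move on. The sketch below is illustrative only (the
names are made up, it is not the actual fs/direct-io.c code), but it
shows why direct-io never holds more than a handful of pages pinned at
once, while a userptr BO would hold the whole buffer pinned:

#include <linux/kernel.h>
#include <linux/mm.h>
#include <linux/pagemap.h>

/*
 * Sketch only: walk a user range in small batches so that only a
 * bounded number of pages is pinned at any point in time.
 */
#define PIN_BATCH 64

static int process_user_range_batched(unsigned long start,
				      unsigned long nr_pages, int write)
{
	struct page *pages[PIN_BATCH];
	int i, got;

	while (nr_pages) {
		int chunk = min_t(unsigned long, nr_pages, PIN_BATCH);

		/* Pin at most PIN_BATCH pages for this iteration. */
		got = get_user_pages_fast(start, chunk, write, pages);
		if (got <= 0)
			return got ? got : -EFAULT;

		/* ... submit and wait for the I/O on these pages ... */

		for (i = 0; i < got; i++) {
			/* Mark the page dirty only if the device wrote to it. */
			if (write)
				set_page_dirty(pages[i]);
			put_page(pages[i]);
		}

		start += (unsigned long)got << PAGE_SHIFT;
		nr_pages -= got;
	}
	return 0;
}
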
> 
> I thought we would handle this gracefully by accounting the memory
> pinned down to the GTT size as well. E.g. allocating GTT buffers
> pins down memory in the same way we would pin it down through this
> interface.
> 
> In the end the maximum amount of pinned down memory is always the GTT size.

Yes, it could be argued that way: any user that can use the GPU can
already pin a large amount of RAM. But pinning anonymous memory or
file-backed memory (i.e. non-regular BO memory) is different, as those
pages are still on the LRU list and thus the vmscan code will still go
over them, which might worsen the out-of-memory case.

> 
> Regards,
> Christian.
> 
> >
> >Most other issues can be worked out.
> >
> >Cheers,
> >Jérôme
> >
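
For what it is worth, the GTT-size accounting described above could look
roughly like the sketch below. It is illustrative only, not an actual
patch: the gtt_pinned counter and gtt_pin_lock are made-up fields, while
rdev->mc.gtt_size is the existing GTT size. Every pin, whether for a
regular GTT BO or a userptr range, would draw from the same budget:

#include <linux/spinlock.h>

#include "radeon.h"

/*
 * Illustrative sketch only: account every pin against the GTT size so
 * the total amount of pinned memory can never exceed it.  gtt_pinned
 * and gtt_pin_lock are hypothetical, not part of struct radeon_device.
 */
static int radeon_gtt_pin_account(struct radeon_device *rdev, u64 size)
{
	int r = 0;

	spin_lock(&rdev->gtt_pin_lock);
	if (rdev->gtt_pinned + size > rdev->mc.gtt_size)
		r = -ENOMEM;	/* over the GTT budget, refuse the pin */
	else
		rdev->gtt_pinned += size;
	spin_unlock(&rdev->gtt_pin_lock);

	return r;
}

static void radeon_gtt_pin_unaccount(struct radeon_device *rdev, u64 size)
{
	spin_lock(&rdev->gtt_pin_lock);
	rdev->gtt_pinned -= size;
	spin_unlock(&rdev->gtt_pin_lock);
}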