On 26.03.2018 at 17:42, Jerome Glisse wrote:
On Mon, Mar 26, 2018 at 10:01:21AM +0200, Daniel Vetter wrote:
On Thu, Mar 22, 2018 at 10:58:55AM +0100, Christian König wrote:
On 22.03.2018 at 08:18, Daniel Vetter wrote:
[SNIP]
The key takeaway from that was that neither the MMU notifier nor the
shrinker may take any lock that you also hold while calling kmalloc (or,
simply speaking, get_user_pages()).
In addition, all kinds of different locks might be held in the MMU notifier
or shrinker callback, so you basically can't assume that you can do things
like recursive page table walks or call dma_unmap_anything.
That sounds like a design bug in mmu_notifiers, since it would render them
useless for KVM. And they were developed for that originally. I think I'll
chat with Jerome to understand this, since it's all rather confusing.
Doing dma_unmap() during an mmu_notifier callback should be fine; it was the
last time I checked. However, there is no formal contract that it is OK to
do so.
As I said before dma_unmap() isn't the real problem here.
The issue is rather that you can't take a lock in the MMU notifier which
you would also take while allocating memory without GFP_NOIO.
That makes it rather tricky to do any command submission, e.g. you need
to grab all the pages/memory/resources beforehand, then make sure that no
MMU notifier is running concurrently, and only then do the submission.
If any of the prerequisites isn't fulfilled, we need to restart the
operation.
[SNIP]
A slightly better solution is using an atomic counter:
driver_range_start() {
    atomic_inc(&mydev->notifier_count);
    ...
Yeah, that is exactly what amdgpu is doing now. Sorry if my description
didn't make that clear.
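
To make the counter scheme and the restart rule for command submission a bit
more concrete, here is a minimal user-space sketch using C11 atomics. It is
not the amdgpu code. Apart from driver_range_start() and notifier_count, which
come from the snippet quoted above, every name (struct mydev, notifier_seq,
driver_range_end, submit_begin, submit_valid) is hypothetical. The sequence
number is an addition of this sketch, so that an invalidation which starts and
finishes while the pages are being grabbed is also noticed.

```c
/* User-space illustration only; a real driver would use kernel atomics
 * and would do the final check under a lock the notifier also takes. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct mydev {
    atomic_int  notifier_count; /* invalidations currently in flight */
    atomic_uint notifier_seq;   /* bumped each time an invalidation ends */
};

/* Called when an invalidation of a range starts. */
static void driver_range_start(struct mydev *dev)
{
    atomic_fetch_add(&dev->notifier_count, 1);
}

/* Called when the invalidation is finished (hypothetical counterpart). */
static void driver_range_end(struct mydev *dev)
{
    atomic_fetch_add(&dev->notifier_seq, 1);
    int prev = atomic_fetch_sub(&dev->notifier_count, 1);
    assert(prev > 0); /* every end must be paired with a start */
    (void)prev;
}

/* Take a snapshot BEFORE grabbing pages/memory/resources. */
static unsigned int submit_begin(struct mydev *dev)
{
    return atomic_load(&dev->notifier_seq);
}

/* Returns true if it is safe to submit: no invalidation is running and
 * none has completed since the snapshot. Otherwise restart the operation. */
static bool submit_valid(struct mydev *dev, unsigned int seq)
{
    return atomic_load(&dev->notifier_count) == 0 &&
           atomic_load(&dev->notifier_seq) == seq;
}
```

The intended use is a loop: call submit_begin(), grab all resources without
holding any lock the notifier might take, then submit only if submit_valid()
still returns true, and otherwise drop the resources and start over.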
I would like to see drivers using the same code, as it means one place to fix
issues. For a long time I have had doing the above conversion for the amd or
radeon kernel driver on my TODO list. I am moving it up my TODO list;
hopefully in the next few weeks I can send an RFC so people can get a real
sense of how it can look.
Certainly a good idea, but I think we might want to have that separate from
HMM. TTM really suffered from feature overload, e.g. trying to do everything
in a single subsystem. And it would be rather nice to have coherent
userptr handling for GPUs as a separate feature.
Regards,
Christian.