On 16/07/2019 13:49, Chris Wilson wrote:
Following a try_to_unmap() we may want to remove the userptr and so call
put_pages(). However, try_to_unmap() acquires the page lock and so we
must avoid recursively locking the pages ourselves -- which means that
we cannot safely acquire the lock around set_page_dirty(). Since we
can't be sure of the lock, we have to risk skip dirtying the page, or
else risk calling set_page_dirty() without a lock and so risk fs
corruption.
So if trylock randomly fail we get data corruption in whatever data set
application is working on, which is what the original patch was trying
to avoid? Are we able to detect the backing store type so at least we
don't risk skipping set_page_dirty with anonymous/shmemfs?
Regards,
Tvrtko
Reported-by: Lionel Landwerlin <lionel.g.landwerlin@xxxxxxxxx>
Fixes: cb6d7c7dc7ff ("drm/i915/userptr: Acquire the page lock around set_page_dirty()")
Signed-off-by: Chris Wilson <chris@xxxxxxxxxxxxxxxxxx>
Cc: Lionel Landwerlin <lionel.g.landwerlin@xxxxxxxxx>
Cc: Tvrtko Ursulin <tvrtko.ursulin@xxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
---
drivers/gpu/drm/i915/gem/i915_gem_userptr.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
index b9d2bb15e4a6..1ad2047a6dbd 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_userptr.c
@@ -672,7 +672,7 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
obj->mm.dirty = false;
for_each_sgt_page(page, sgt_iter, pages) {
- if (obj->mm.dirty)
+ if (obj->mm.dirty && trylock_page(page)) {
/*
* As this may not be anonymous memory (e.g. shmem)
* but exist on a real mapping, we have to lock
@@ -680,8 +680,20 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
* the page reference is not sufficient to
* prevent the inode from being truncated.
* Play safe and take the lock.
+ *
+ * However...!
+ *
+ * The mmu-notifier can be invalidated for a
+ * migrate_page, that is alreadying holding the lock
+ * on the page. Such a try_to_unmap() will result
+ * in us calling put_pages() and so recursively try
+ * to lock the page. We avoid that deadlock with
+ * a trylock_page() and in exchange we risk missing
+ * some page dirtying.
*/
- set_page_dirty_lock(page);
+ set_page_dirty(page);
+ unlock_page(page);
+ }
mark_page_accessed(page);
put_page(page);