On 11/25/2020 10:02 AM, Jason Gunthorpe wrote:
> On Mon, Nov 23, 2020 at 11:50:24AM -0500, Dennis Dalessandro wrote:
>> @@ -133,8 +121,16 @@ void hfi1_mmu_rb_unregister(struct mmu_rb_handler *handler)
>>  	unsigned long flags;
>>  	struct list_head del_list;
>> +	/*
>> +	 * do_exit() calls exit_mm() before exit_files() which would call close
>> +	 * and end up in here. If there is no mm, then its a kernel thread and
>> +	 * we need to let it continue the removal.
>> +	 */
>> +	if (current->mm && (handler->mn.mm != current->mm))
>> +		return;
>> +
>>  	/* Unregister first so we don't get any more notifications. */
>> -	mmu_notifier_unregister(&handler->mn, handler->mm);
>> +	mmu_notifier_unregister(&handler->mn, handler->mn.mm);
>
> This logic cannot be right.. The only caller does:
>
> 	if (pq->handler)
> 		hfi1_mmu_rb_unregister(pq->handler);
> [..]
> 	kfree(pq);
>
> So this is leaking the mmu_notifier registration if the user manages
> to trigger hfi1_user_sdma_free_queues() from another process.
>
> Since hfi1_user_sdma_free_queues() is called from close() it doesn't
> look OK.
>
> When the object that creates the notifier is destroyed the notifier
> should be deleted unconditionally.
>
> Only accesses to a VA should be qualified to ensure that a notifier is
> registered on current->mm before touching the VA.
Ah yes. I think this just all goes away then. The context init and pq
allocation are what trigger the registration, so tearing them down
should always do the de-registration. v5 coming up after running tests.
-Denny
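
The rule discussed in this thread (tear down the notifier unconditionally, and qualify only VA accesses by the caller's mm) can be sketched as a small userspace model. This is an illustration only, not the hfi1 code and not the v5 patch: struct mm, struct handler, struct queue and every function below are hypothetical stand-ins for the kernel objects, and notifier_count merely models "a notifier is registered on this mm".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an address space; not the kernel's mm_struct. */
struct mm {
	int notifier_count;	/* models notifiers registered on this mm */
};

/* Hypothetical stand-in for the handler that owns the registration. */
struct handler {
	struct mm *mm;		/* mm the notifier was registered on */
	bool registered;
};

/* Hypothetical stand-in for the queue object that creates the handler. */
struct queue {
	struct handler *handler;
};

/* Registration happens when the owning object is set up. */
static struct handler *handler_register(struct mm *mm)
{
	struct handler *h = malloc(sizeof(*h));

	if (!h)
		return NULL;
	h->mm = mm;
	h->registered = true;
	mm->notifier_count++;
	return h;
}

/*
 * Teardown is unconditional: it never looks at the caller's mm, so a
 * close() issued from a different process cannot leave a registration
 * behind on the original mm.
 */
static void handler_unregister(struct handler *h)
{
	assert(h && h->registered);
	h->mm->notifier_count--;
	h->registered = false;
	free(h);
}

/* Destroying the owning object always removes the notifier with it. */
static void queue_free(struct queue *q)
{
	if (q->handler)
		handler_unregister(q->handler);
	q->handler = NULL;
}

/*
 * Only VA accesses are qualified: the caller must be running on the mm
 * the notifier was registered on. A NULL current_mm models a caller
 * without an mm, which must not touch user VAs.
 */
static bool handler_may_touch_va(const struct handler *h,
				 const struct mm *current_mm)
{
	return h->registered && current_mm != NULL && h->mm == current_mm;
}
```

In this model the mm comparison lives only in handler_may_touch_va(), so queue_free() drops the registration regardless of which process ends up calling it.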