On Thu, May 14, 2015 at 07:31:11PM +0200, Andrea Arcangeli wrote:
> @@ -255,21 +259,23 @@ int handle_userfault(struct vm_area_struct *vma, unsigned long address,
>  	 * through poll/read().
>  	 */
>  	__add_wait_queue(&ctx->fault_wqh, &uwq.wq);
> -	for (;;) {
> -		set_current_state(TASK_KILLABLE);
> -		if (!uwq.pending || ACCESS_ONCE(ctx->released) ||
> -		    fatal_signal_pending(current))
> -			break;
> -		spin_unlock(&ctx->fault_wqh.lock);
> +	set_current_state(TASK_KILLABLE);
> +	spin_unlock(&ctx->fault_wqh.lock);
>
> +	if (likely(!ACCESS_ONCE(ctx->released) &&
> +		   !fatal_signal_pending(current))) {
>  		wake_up_poll(&ctx->fd_wqh, POLLIN);
>  		schedule();
> +		ret |= VM_FAULT_MAJOR;
> +	}

So what happens here if schedule() spontaneously wakes for no reason?

I'm not sure enough of userfaultfd semantics to say if that would be bad,
but the code looks suspiciously like it relies on schedule() not to do
that; which is wrong.

> +	__set_current_state(TASK_RUNNING);
> +	/* see finish_wait() comment for why list_empty_careful() */
> +	if (!list_empty_careful(&uwq.wq.task_list)) {
>  		spin_lock(&ctx->fault_wqh.lock);
> +		list_del_init(&uwq.wq.task_list);
> +		spin_unlock(&ctx->fault_wqh.lock);
>  	}
> -	__remove_wait_queue(&ctx->fault_wqh, &uwq.wq);
> -	__set_current_state(TASK_RUNNING);
> -	spin_unlock(&ctx->fault_wqh.lock);
>
>  	/*
>  	 * ctx may go away after this if the userfault pseudo fd is
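
To make the spurious-wakeup concern concrete, here is a minimal single-threaded
userspace sketch. It is not kernel code and not the userfaultfd implementation:
struct waiter, fake_schedule(), wait_once() and wait_loop() are hypothetical
names invented for this illustration, and fake_schedule() only models a
schedule() that is allowed to return before the wait condition has changed.
Whether a single sleep is actually a problem in handle_userfault() depends on
what the caller does after it returns, which the quoted hunk does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-fault wait state (illustration only). */
struct waiter {
	bool pending;       /* true until the fault has been resolved */
	int  spurious_left; /* how many wakeups will be spurious */
	int  sleeps;        /* how many times we went to sleep */
};

/*
 * Models schedule(): it may return without the condition having changed.
 * Only a "real" wakeup clears w->pending.
 */
static void fake_schedule(struct waiter *w)
{
	w->sleeps++;
	if (w->spurious_left > 0) {
		w->spurious_left--;	/* spurious wakeup: nothing resolved */
		return;
	}
	w->pending = false;		/* real wakeup: the fault was resolved */
}

/* Sleep exactly once, as in the "+" side of the hunk; may return too early. */
static bool wait_once(struct waiter *w)
{
	if (w->pending)
		fake_schedule(w);
	return !w->pending;		/* false means we woke up unresolved */
}

/* Re-check the condition after every wakeup, as in the removed for (;;) loop. */
static bool wait_loop(struct waiter *w)
{
	while (w->pending)
		fake_schedule(w);
	assert(!w->pending);		/* the loop only exits once resolved */
	return true;
}
```

With one spurious wakeup queued, wait_once() returns false after a single
sleep, while wait_loop() goes back to sleep and only returns once the condition
really holds.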