Re: [PATCH 14/23] userfaultfd: wake pending userfaults

Peter Zijlstra <peterz@xxxxxxxxxxxxx> · Thu, 22 Oct 2015 14:10:56 +0200

On Thu, May 14, 2015 at 07:31:11PM +0200, Andrea Arcangeli wrote:
> @@ -255,21 +259,23 @@ int handle_userfault(struct vm_area_struct *vma, unsigned long address,
>  	 * through poll/read().
>  	 */
>  	__add_wait_queue(&ctx->fault_wqh, &uwq.wq);
> -	for (;;) {
> -		set_current_state(TASK_KILLABLE);
> -		if (!uwq.pending || ACCESS_ONCE(ctx->released) ||
> -		    fatal_signal_pending(current))
> -			break;
> -		spin_unlock(&ctx->fault_wqh.lock);
> +	set_current_state(TASK_KILLABLE);
> +	spin_unlock(&ctx->fault_wqh.lock);
>  
> +	if (likely(!ACCESS_ONCE(ctx->released) &&
> +		   !fatal_signal_pending(current))) {
>  		wake_up_poll(&ctx->fd_wqh, POLLIN);
>  		schedule();
> +		ret |= VM_FAULT_MAJOR;
> +	}

So what happens here if schedule() spontaneously wakes for no reason?

I don't know the userfaultfd semantics well enough to say whether that
would be bad, but the code looks suspiciously like it relies on
schedule() never doing that, which is wrong.
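
To make the concern concrete, here is a small userspace toy model, not
kernel code. Everything in it (struct fake_fault, fake_schedule(),
wait_once(), wait_loop()) is a hypothetical name invented for
illustration. It assumes a schedule() stand-in that may return a scripted
number of times without the waited-for condition having changed, and
compares a single-shot wait with one that re-checks the condition in a
loop. Whether an early return is actually harmful for userfaultfd is left
open, as above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state the faulting task waits on. */
struct fake_fault {
	bool pending;		/* cleared by the "waker" once the fault is resolved */
	int spurious_left;	/* how many times fake_schedule() returns early */
	int schedule_calls;	/* how often the waiter went to sleep */
};

/*
 * Toy model of schedule(): while spurious_left > 0 it returns without the
 * condition having changed; after that it models a genuine wakeup.
 */
static void fake_schedule(struct fake_fault *f)
{
	f->schedule_calls++;
	if (f->spurious_left > 0) {
		f->spurious_left--;	/* spurious wakeup: pending untouched */
		return;
	}
	f->pending = false;		/* genuine wakeup: the waker resolved it */
}

/* Single-shot wait: trusts that one return from schedule() means "done". */
static bool wait_once(struct fake_fault *f)
{
	if (f->pending)
		fake_schedule(f);
	return !f->pending;		/* false if we woke up too early */
}

/* Loop form: re-check the condition after every return from schedule(). */
static bool wait_loop(struct fake_fault *f)
{
	while (f->pending)
		fake_schedule(f);
	/* Postcondition: the loop can only exit once the condition holds. */
	assert(!f->pending);
	return true;
}
```

In this model, wait_once() with one scripted spurious wakeup returns
while the fault is still pending, whereas wait_loop() simply goes back to
sleep until the condition really changes.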

> +	__set_current_state(TASK_RUNNING);
> +	/* see finish_wait() comment for why list_empty_careful() */
> +	if (!list_empty_careful(&uwq.wq.task_list)) {
>  		spin_lock(&ctx->fault_wqh.lock);
> +		list_del_init(&uwq.wq.task_list);
> +		spin_unlock(&ctx->fault_wqh.lock);
>  	}
> -	__remove_wait_queue(&ctx->fault_wqh, &uwq.wq);
> -	__set_current_state(TASK_RUNNING);
> -	spin_unlock(&ctx->fault_wqh.lock);
>  
>  	/*
>  	 * ctx may go away after this if the userfault pseudo fd is