RE: [RESEND RFC PATCH 4/5] mm/remote_mapping: use a pidfd to access memory belonging to unrelated process

Mircea CIRJALIU - MELIU <mcirjaliu@xxxxxxxxxxxxxxx> · Wed, 9 Sep 2020 08:32:42 +0000

> From: Oleg Nesterov <oleg@xxxxxxxxxx>
> 
> it seems that nobody is going to review this patch ;)
> 
> So I tried to read mirror_vm_fault() and the usage of mmap_sem doesn't
> look right to me. But let me repeat, this is not my area I can be easily wrong,
> please correct me.
> 
> On 09/04, Adalbert Lazăr wrote:
> >
> > +static vm_fault_t mirror_vm_fault(struct vm_fault *vmf) {
> > +	struct vm_area_struct *vma = vmf->vma;
> > +	struct mm_struct *mm = vma->vm_mm;
> > +	struct remote_vma_context *ctx = vma->vm_private_data;
> > +	struct remote_view *view = ctx->view;
> > +	struct file *file = vma->vm_file;
> > +	struct remote_file_context *fctx = file->private_data;
> > +	unsigned long req_addr;
> > +	unsigned int gup_flags;
> > +	struct page *req_page;
> > +	vm_fault_t result = VM_FAULT_SIGBUS;
> > +	struct mm_struct *src_mm = fctx->mm;
> > +	unsigned long seq;
> > +	int idx;
> > +
> > +fault_retry:
> > +	seq = mmu_interval_read_begin(&view->mmin);
> > +
> > +	idx = srcu_read_lock(&fctx->fault_srcu);
> > +
> > +	/* check if view was invalidated */
> > +	if (unlikely(!READ_ONCE(view->valid))) {
> > +		pr_debug("%s: region [%lx-%lx) was invalidated!!\n",
> __func__,
> > +			view->offset, view->offset + view->size);
> > +		goto out_invalid;		/* VM_FAULT_SIGBUS */
> > +	}
> > +
> > +	/* drop current mm semapchore */
> > +	up_read(&current->mm->mmap_sem);
> 
> Please use mmap_read_lock/unlock(mm) instead of
> down/up_read(mmap_sem).
> 
> But why is it safe to drop ->mmap_sem without checking
> FAULT_FLAG_ALLOW_RETRY/RETRY_NOWAIT ?
> 
Dropping mmap_sem will have the same effects regardless of FAULT_FLAG_ALLOW_RETRY/RETRY_NOWAIT.
Another thread can unmap the VMA from underneath us, or remap another one in its place.
In the end, the VMA has to be revalidated when re-acquiring the mmap_sem.
Or am I wrong?!

The reason I dropped mmap_sem is cause I had to acquire the remote mm->mmap_sem 
and tried to avoid any nested semaphore scenarios.
AFAIK the faulting rules allow us to return with mmap_sem dropped when FAULT_FLAG_ALLOW_RETRY
is set, but state nothing about dropping and re-acquiring mmap_sem in the fault handler.

> > +	/* take remote mm semaphore */
> > +	if (vmf->flags & FAULT_FLAG_ALLOW_RETRY) {
> > +		if (!down_read_trylock(&src_mm->mmap_sem)) {
> 
> I don't understand this... perhaps you meant FAULT_FLAG_RETRY_NOWAIT ?

Seems you're right. I never understood the definition of FAULT_FLAG_RETRY_NOWAIT.
@FAULT_FLAG_RETRY_NOWAIT: Don't drop mmap_sem and wait when retrying.
Should have been: Don't drop mmap_sem and don't wait when retrying.
And 'Don't drop mmap_sem' is supposed to mean:
/*
  * NOTE! This will make us return with VM_FAULT_RETRY, but with
  * the mmap_sem still held. That's how FAULT_FLAG_RETRY_NOWAIT
  * is supposed to work. We have way too many special cases..
  */
:~(

> > +	 * If FAULT_FLAG_ALLOW_RETRY is set, the mmap_sem must be
> released
> > +	 * before returning VM_FAULT_RETRY only if
> FAULT_FLAG_RETRY_NOWAIT is
> > +	 * not set.
> 
> Well, iiuc FAULT_FLAG_ALLOW_RETRY means that ->fault() _can_ drop
> mmap_sem and return VM_FAULT_RETRY (unless NOWAIT).

That comment is just copied from elsewhere in the code.
My interpretation was that the fault handler _should_ return with mmap_sem
held or not depending on FAULT_FLAG_RETRY_NOWAIT.

> > +	if (vmf->flags & (FAULT_FLAG_ALLOW_RETRY |
> FAULT_FLAG_TRIED)) {
> > +		if (!(vmf->flags & FAULT_FLAG_KILLABLE))
> > +			if (current && fatal_signal_pending(current)) {
> > +				up_read(&current->mm->mmap_sem);
> > +				return VM_FAULT_RETRY;
> 
> I fail to understand this. But in any case, you do not need to check current !=
> NULL and up_read() looks wrong if RETRY_NOWAIT?

My bad. I didn't need to check current != NULL. 
There was the case when the fault was invoked from a kthread in KVM's async_pf
and I confused current with current->mm.

Mircea