Re: [PATCH v2 5/7] mm/mremap: complete refactor of move_vma()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Mar 10, 2025 at 04:11:59PM +0100, Vlastimil Babka wrote:
> On 3/6/25 11:34, Lorenzo Stoakes wrote:
> > We invoke ksm_madvise() with an intentionally dummy flags field, so no
> > need to pass around.
> >
> > Additionally, the code tries to be 'clever' with account_start,
> > account_end, using these to both check that vma->vm_start != 0 and that we
> > ought to account the newly split portion of VMA post-move, either before
> > or after it.
> >
> > We need to do this because we intentionally removed VM_ACCOUNT on the VMA
> > prior to unmapping, so we don't erroneously unaccount memory (we have
> > already calculated the correct amount to account and accounted it, any
> > subsequent subtraction will be incorrect).
> >
> > This patch significantly expands the comment (from 2002!) about
> > 'concealing' the flag to make it abundantly clear what's going on, as well
> > as adding and expanding a number of other comments also.
> >
> > We can remove account_start, account_end by instead tracking when we
> > account (i.e.  vma->vm_flags has the VM_ACCOUNT flag set, and this is not
> > an MREMAP_DONTUNMAP operation), and figuring out when to reinstate the
> > VM_ACCOUNT flag on prior/subsequent VMAs separately.
> >
> > We additionally break the function into logical pieces and attack the very
> > confusing error handling logic (where, for instance, new_addr is set to
> > err).
> >
> > After this change the code is considerably more readable and easy to
> > manipulate.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
>
> Reviewed-by: Vlastimil Babka <vbabka@xxxxxxx>

Thanks!

>
> Some nits below:
>
> > +/*
> > + * Unmap source VMA for VMA move, turning it from a copy to a move, being
> > + * careful to ensure we do not underflow memory account while doing so if an
> > + * accountable move.
> > + *
> > + * This is best effort, if we fail to unmap then we simply try
>
> this looks like an unfinished sentence?

Damn yeah, will fix. Will wait for rest of your review before either
fix-patching or respinning.

>
> > @@ -1007,51 +1157,15 @@ static unsigned long move_vma(struct vma_remap_struct *vrm)
> >  	 */
> >  	hiwater_vm = mm->hiwater_vm;
>
> This...
>
> > -	/* Tell pfnmap has moved from this vma */
> > -	if (unlikely(vma->vm_flags & VM_PFNMAP))
> > -		untrack_pfn_clear(vma);
> > -
> > -	if (unlikely(!err && (vrm->flags & MREMAP_DONTUNMAP))) {
> > -		/* We always clear VM_LOCKED[ONFAULT] on the old vma */
> > -		vm_flags_clear(vma, VM_LOCKED_MASK);
> > -
> > -		/*
> > -		 * anon_vma links of the old vma is no longer needed after its page
> > -		 * table has been moved.
> > -		 */
> > -		if (new_vma != vma && vma->vm_start == old_addr &&
> > -			vma->vm_end == (old_addr + old_len))
> > -			unlink_anon_vmas(vma);
> > -
> > -		/* Because we won't unmap we don't need to touch locked_vm */
> > -		vrm_stat_account(vrm, new_len);
> > -		return new_addr;
> > -	}
> > -
> > -	vrm_stat_account(vrm, new_len);
> > -
> > -	vma_iter_init(&vmi, mm, old_addr);
> > -	if (do_vmi_munmap(&vmi, mm, old_addr, old_len, vrm->uf_unmap, false) < 0) {
> > -		/* OOM: unable to split vma, just get accounts right */
> > -		if (vm_flags & VM_ACCOUNT && !(vrm->flags & MREMAP_DONTUNMAP))
> > -			vm_acct_memory(old_len >> PAGE_SHIFT);
> > -		account_start = account_end = false;
> > -	}
> > +	vrm_stat_account(vrm, vrm->new_len);
> > +	if (unlikely(!err && (vrm->flags & MREMAP_DONTUNMAP)))
> > +		dontunmap_complete(vrm, new_vma);
> > +	else
> > +		unmap_source_vma(vrm);
> >
> >  	mm->hiwater_vm = hiwater_vm;
>
> ... and this AFAICS only applies to the unmap_source_vma() case. And from
> the comment it seems only to the "undo destination vma" variant. BTW this
> also means the unmap_source_vma() name is rather misleading?

I agree it's only meaningful in that case, I am not sure what you mean
about the undo destination vma case?

We have 3 situations in which we call unmap_source_vma():

1. normal, no error move.
2. error has occured, so we set the source to destination and vice-versa.
3. MREMAP_DONTUNMAP, error has occured, so we set the source to destination
   and vice-versa.

I _hate_ this error handling where we reverse the source/destination, it's
so confusing, but since we are doing that, I still think unmap_source_vma()
is meaningful.

unmap_old_vma() is possible too, but I think less clear...

The beter fix I think is to find a less awful way to do the error handling.

>
> So I think the whole handling of that hiwater_vm could move to
> unmap_source_vma(). It should not matter if we take the snapshot of
> hiwater_vm only after vrm_stat_account() bumps the total_vm. Nobody else can
> be racing with us to permanently turn that total_vm to a hiwater_vm.
>
> (this is probably a potential cleanup for a followup-patch anyway)

Yup agreed, but let's do it as a follow up as this is how it is in the
original code.

Will add to the whiteboard TODO...

>
> >
> > -	/* Restore VM_ACCOUNT if one or two pieces of vma left */
> > -	if (account_start) {
> > -		vma = vma_prev(&vmi);
> > -		vm_flags_set(vma, VM_ACCOUNT);
> > -	}
> > -
> > -	if (account_end) {
> > -		vma = vma_next(&vmi);
> > -		vm_flags_set(vma, VM_ACCOUNT);
> > -	}
> > -
> > -	return new_addr;
> > +	return err ? (unsigned long)err : vrm->new_addr;
> >  }
> >
> >  /*
> > --
> > 2.48.1
>




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux