Re: [PATCH] mm/mmap: convert all mas except mas_detach to vma iterator

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2024/2/21 22:31, Liam R. Howlett wrote:
* Yajun Deng <yajun.deng@xxxxxxxxx> [240220 22:26]:
On 2024/2/21 02:06, Liam R. Howlett wrote:
* Yajun Deng <yajun.deng@xxxxxxxxx> [240218 21:30]:
Cc:  Vlastimil, Lorenzo,Suren

On 2024/2/18 10:31, Yajun Deng wrote:
There are two types of iterators mas and vmi in the current code. If the
maple tree comes from the mm struct, we can use vma iterator. Avoid using
mas directly.
Thanks for looking at this.

I had left the maple state exposed in the mmap.c file because it does a
number of operations that no one else does, so the new functions will be
called a very limited number of times (as low as once).

I think this is a worth while change since this may be needed in the
future for dealing with more specialised uses of the tree.  It also
removes the 0/ULONG_MAX limits from certain calls, and the vma iterator
names help explain things.

I don't know why you treat the low/high search differently than the
mas_reset() and mas_*_range(). In any case, the comment is inaccurate
when mas_ functions are called with &vmi.mas in places.


Because the mas_reset() and mas_*_range() only covert mas to vmi. It's
simple.

But the low/high also covert max to max - 1. It's a little more complex.
Ah, so the code doesn't match the comment, since the code will still use
mas directly in this version.  This was, perhaps, the largest issue with
the patch.  Having a good patch log is very important as people rely on
it during reviews, but more importantly when tracking down an issue
later on.

I like the idea of removing as many mas uses as feasible, but we will
still have a few that must be passed through, so please change the
wording.


Okay.

Leave the mt_detach tree keep using mas, as it doesn't come from the mm
struct.
Well, it's still VMAs from the mm struct.  I agree that we should not
change this for now.

Convert all mas except mas_detach to vma iterator. And introduce
vma_iter_area_{lowest, highest} helper functions for use vma interator.
Do you mean mas functions?  You do pass the maple state through to other
areas.  ie: free_pgtables().

Yes.
...
    retry:
-	if (mas_empty_area_rev(&mas, low_limit, high_limit - 1, length))
+	if (vma_iter_area_highest(&vmi, low_limit, high_limit, length))
    		return -ENOMEM;
-	gap = mas.last + 1 - info->length;
+	gap = vma_iter_end(&vmi) - info->length;
    	gap -= (gap - info->align_offset) & info->align_mask;
-	gap_end = mas.last;
-	tmp = mas_next(&mas, ULONG_MAX);
+	gap_end = vma_iter_end(&vmi);
vma_iter_end will return vmi->mas.last + 1, is what you have here
correct?

Yes, the following changes 'if (vm_start_gap(tmp) <= gap_end)' to 'if
(vm_start_gap(tmp) < gap_end)'.

+	tmp = vma_next(&vmi);
    	if (tmp && (tmp->vm_flags & VM_STARTGAP_FLAGS)) { /* Avoid prev check if possible */
-		if (vm_start_gap(tmp) <= gap_end) {
+		if (vm_start_gap(tmp) < gap_end) {
Thanks.  This works and the variable isn't used again.

...
@@ -1959,11 +1958,12 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address)
    	struct vm_area_struct *next;
    	unsigned long gap_addr;
    	int error = 0;
-	MA_STATE(mas, &mm->mm_mt, vma->vm_start, address);
+	VMA_ITERATOR(vmi, mm, 0);
    	if (!(vma->vm_flags & VM_GROWSUP))
    		return -EFAULT;
+	vma_iter_config(&vmi, vma->vm_start, address);
This is confusing.  I think you are doing this so that the vma iterator
is set up the same as the maple state, and not what is logically
necessary?

Yes, VMA_ITERATOR can only pass one address.

    	/* Guard against exceeding limits of the address space. */
    	address &= PAGE_MASK;
    	if (address >= (TASK_SIZE & PAGE_MASK))
@@ -1985,15 +1985,15 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address)
    	}
    	if (next)
-		mas_prev_range(&mas, address);
+		mas_prev_range(&vmi.mas, address);
This isn't really hiding the maple state.

Okay,  I will create a new helper function for this in the mm/internal.h.


-	__mas_set_range(&mas, vma->vm_start, address - 1);
-	if (mas_preallocate(&mas, vma, GFP_KERNEL))
+	vma_iter_config(&vmi, vma->vm_start, address);
The above maple state changes is to get the maple state to point to the
correct area for the preallocation call below.  This seems unnecessary
to me.

We really should just set it up correctly.  Unfortunately, with the VMA
iterator, that's not really possible on initialization.

What we can do is use the vma->vm_start for the initialization, then use
vma_iter_config() here.  That will not reset any state - but that's fine
because the preallocation is the first call that actually uses it
anyways.

So we can initialize with vma->vm_start, don't call vma_iter_config
until here, and also drop the if (next) part.

This is possible here because it's not optimised like the
expand_upwards() case, which uses the state to check prev and avoids an
extra walk.

Please make sure to test with the ltp tests on the stack combining, etc
on a platform that expands down.


It seems something wrong about this description. This change is in expand_upwards(), but not in

expand_downwards(). So we should test it on a platform that expands up. And drop the if (next) part

is unnecessary. Did I get that right?


Okay, I will test it.
Testing this can be tricky.  Thanks for looking at it.

...
    	mmap_write_lock(mm);
    	mt_clear_in_rcu(&mm->mm_mt);
-	mas_set(&mas, vma->vm_end);
-	free_pgtables(&tlb, &mas, vma, FIRST_USER_ADDRESS,
+	vma_iter_set(&vmi, vma->vm_end);
+	free_pgtables(&tlb, &vmi.mas, vma, FIRST_USER_ADDRESS,
    		      USER_PGTABLES_CEILING, true);
I guess the page tables still deal with the maple state directly then.

Yes.
That's okay, we can leave that for another time.  I believe Peng
complicated the internals of free_pgtables() with his forking
optimisation so care will need to be taken and should probably be done
in another patch, another time.  In fact, hiding xa_is_zero() within the
vma iterator is going to be a really good thing to do, if performance
doesn't suffer.

Just don't state we are removing the use of maple state in the change
log - since we do pass it through sometimes.

...

Thanks again,
Liam




[Index of Archives]     [Linux ARM Kernel]     [Linux ARM]     [Linux Omap]     [Fedora ARM]     [IETF Annouce]     [Bugtraq]     [Linux OMAP]     [Linux MIPS]     [eCos]     [Asterisk Internet PBX]     [Linux API]

  Powered by Linux