Re: [RFC PATCH v2 4/4] mm/madvise: remove redundant mmap_lock operations from process_madvise()

SeongJae Park <sj@xxxxxxxxxx> · Wed, 5 Feb 2025 22:28:01 -0800

On Tue,  4 Feb 2025 11:53:43 -0800 SeongJae Park <sj@xxxxxxxxxx> wrote:

> On Fri, 31 Jan 2025 17:51:45 +0000 Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx> wrote:
> 
> > On Fri, Jan 31, 2025 at 12:47:24PM -0500, Liam R. Howlett wrote:
> > > * Davidlohr Bueso <dave@xxxxxxxxxxxx> [250131 12:31]:
> > > > On Fri, 31 Jan 2025, Lorenzo Stoakes wrote:
> > > >
> > > > > On Thu, Jan 16, 2025 at 05:30:58PM -0800, SeongJae Park wrote:
> > > > > > Optimize redundant mmap lock operations from process_madvise() by
> > > > > > directly doing the mmap locking first, and then the remaining works for
> > > > > > all ranges in the loop.
> > > > > >
> > > > > > Signed-off-by: SeongJae Park <sj@xxxxxxxxxx>
> > > > >
> > > > > I wonder if this might increase lock contention because now all of the
> > > > > vector operations will hold the relevant mm lock without releasing after
> > > > > each operation?
[...]
> > 
> > Keep in mind process_madvise() is not limited by IOV_MAX, which can be rather
> > high, but rather UIO_FASTIOV, which is limited to 8 entries.
> 
> process_madvise() indeed have iovec array of UIO_FASTIOV size, namely iovstack.
> But, if I understood the code correctly, iostack is used only for a fast path
> that the user requested advicing less than UIO_FASTIOV regions.
> 
> I actually confirmed I can make the loop itrate 1024 times, using my
> microbenchmark[1].  My step for the check was running the program with
> 'eval_pmadv $((4*1024*1024)) $((4*1024)) $((4*1024*1024))' command, and
> counting the number of the itration of the vector_madvise() main loop using
> printk().  Please let me know if I'm missing something.
> 
> > 
> > (Some have been surprised by this limitation...!)
> > 
> > So I think at this point scaling isn't a huge issue, I raise it because in
> > future we may want to increase this limit, at which point we should think about
> > it, which is why I sort of hand-waved it away a bit.
> 
> I personally think this may still not be a huge issue, especially given the
> fact that users can test and tune the limit.  But I'd like to hear more
> opinions if available.

Maybe I should wait more for the more opinions, but I just impatiently sent the
next spin of this patch series[1].  Because Davidlohr and Lorenzo promised
their tags based on the assumption of UIO_FASTIOV limit while I think that may
not be an effective limit, I didn't add their tags on the fourth patch of the
series.  Please let me know if you have any concern about that, either on this
thread or on the new patch series thread[1].

[1] https://lore.kernel.org/20250206061517.2958-1-sj@xxxxxxxxxx

> 
> [1] https://github.com/sjp38/eval_proc_madvise/blob/master/eval_pmadv.c

Thanks,
SJ

[...]