Re: [PATCH 39/41] kernel/fork: throttle call_rcu() calls in vm_area_free

Suren Baghdasaryan <surenb@xxxxxxxxxx> · Tue, 17 Jan 2023 17:19:46 -0800

On Tue, Jan 17, 2023 at 7:57 AM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Mon 09-01-23 12:53:34, Suren Baghdasaryan wrote:
> > call_rcu() can take a long time when callback offloading is enabled.
> > Its use in the vm_area_free can cause regressions in the exit path when
> > multiple VMAs are being freed.
>
> What kind of regressions.
>
> > To minimize that impact, place VMAs into
> > a list and free them in groups using one call_rcu() call per group.
>
> Please add some data to justify this additional complexity.

Sorry, I should have done that in the first place. A 4.3% regression
was noticed when running the execl test from the unixbench suite. The
spawn test also showed a 1.6% regression. Profiling revealed that VMA
freeing was taking longer because of call_rcu(), which is slow when RCU
callback offloading is enabled. I asked Paul McKenney and he explained
that because the callbacks are offloaded to some other kthread,
possibly running on some other CPU, explicit locking is necessary.
Locking on a per-call_rcu() basis would result in excessive contention
during callback flooding. So, by batching the call_rcu() work we cut
that overhead and reduce this lock contention.
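
To illustrate the batching idea, here is a minimal userspace sketch. It
is not the code from the patch: struct vma_stub, struct free_batch,
mock_call_rcu(), batch_free_vma(), batch_drain(), demo_free_vmas() and
the BATCH_SIZE value of 32 are all hypothetical names and numbers chosen
for illustration, and the mock frees objects immediately instead of
waiting for a grace period. It only shows how queueing VMAs on a list
reduces the number of call_rcu()-style invocations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BATCH_SIZE 32	/* hypothetical threshold, not taken from the patch */

struct vma_stub {		/* stand-in for struct vm_area_struct */
	struct vma_stub *next;
};

struct free_batch {
	struct vma_stub *head;	/* VMAs queued for freeing */
	size_t count;		/* number of VMAs currently queued */
	size_t rcu_calls;	/* how often the mock call_rcu() ran */
	size_t freed;		/* total VMAs actually freed */
};

/*
 * Mock of call_rcu(): the real one queues a callback that runs after a
 * grace period. Here the whole list is freed immediately, and we only
 * count the invocation, since that is the cost being modelled.
 */
static void mock_call_rcu(struct free_batch *b)
{
	struct vma_stub *v = b->head;

	b->rcu_calls++;
	while (v) {
		struct vma_stub *next = v->next;

		free(v);
		b->freed++;
		v = next;
	}
	b->head = NULL;
	b->count = 0;
}

/* Queue one VMA; hand the group over once it reaches BATCH_SIZE. */
static void batch_free_vma(struct free_batch *b, struct vma_stub *v)
{
	v->next = b->head;
	b->head = v;
	if (++b->count >= BATCH_SIZE)
		mock_call_rcu(b);
}

/* Flush a partially filled group, e.g. at the end of the exit path. */
static void batch_drain(struct free_batch *b)
{
	if (b->count)
		mock_call_rcu(b);
}

/* Free n stub VMAs via the batch; return the number of mock calls. */
static size_t demo_free_vmas(size_t n)
{
	struct free_batch b = {0};

	for (size_t i = 0; i < n; i++) {
		struct vma_stub *v = malloc(sizeof(*v));

		if (!v)
			abort();
		batch_free_vma(&b, v);
	}
	batch_drain(&b);
	assert(b.freed == n);	/* every queued VMA was released */
	return b.rcu_calls;
}
```

With these assumed numbers, freeing 100 VMAs takes 4 mock call_rcu()
invocations (three full groups of 32 plus one drain of the remaining 4)
rather than 100, which is the reduction in per-call locking described
above.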

> --
> Michal Hocko
> SUSE Labs