Re: [PATCH 2/2] mm/vmalloc: rework the drain logic

Uladzislau Rezki <urezki@xxxxxxxxx> · Tue, 17 Nov 2020 14:04:34 +0100

On Tue, Nov 17, 2020 at 10:37:34AM +0800, huang ying wrote:
> On Tue, Nov 17, 2020 at 6:00 AM Uladzislau Rezki (Sony)
> <urezki@xxxxxxxxx> wrote:
> >
> > The current "lazy drain" model suffers from at least two issues.
> >
> > The first is that the list of vmap areas is unsorted, so
> > identifying the [min:max] range of areas to be drained requires
> > a full list scan. That is time consuming if the list is long.
> >
> > The second, which comes as the next step, is merging all
> > fragments with the free space. That is also time consuming
> > because it has to iterate over the entire list that holds the
> > outstanding lazy areas.
> >
> > See below the "preemptirqsoff" tracer output that illustrates the
> > high latency. It is ~24 676us. Our workloads, such as audio and
> > video, are affected by such long latency:
> 
> This seems like a real problem.  But I found there is a long-latency
> avoidance mechanism in the loop in __purge_vmap_area_lazy(), as
> follows,
> 
>         if (atomic_long_read(&vmap_lazy_nr) < resched_threshold)
>             cond_resched_lock(&free_vmap_area_lock);
> 
I added that "resched threshold" because in my tests I could easily
run out of memory: the drain work could not keep up with such a long
list of outstanding vmap areas.

>
> If it works properly, the latency problem can be solved.  Can you
> check whether it works for you?
>
We have that cond_resched_lock() in our products. The patch in
question creates bigger vmap areas at an early stage (by merging them),
so the final structure becomes less fragmented, which speeds up the
drain logic and thus reduces the preemption-off time.
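
To show the merging idea in isolation, here is a minimal userspace
sketch. It is not the patch itself: the names (struct range,
struct range_set, range_set_add) are hypothetical, a fixed-size sorted
array stands in for the kernel data structure, and it assumes the freed
ranges never overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a lazily freed area: [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
};

#define MAX_RANGES 64

/* Ranges are kept sorted by start address and never overlap. */
struct range_set {
	struct range r[MAX_RANGES];
	size_t nr;
};

/*
 * Add [start, end), merging it with any neighbour it touches.
 * Returns 0 on success, -1 if the array is full.
 */
static int range_set_add(struct range_set *s,
			 unsigned long start, unsigned long end)
{
	size_t i, j;

	assert(start < end);

	/* Find the first range that starts at or after the new one. */
	for (i = 0; i < s->nr && s->r[i].start < start; i++)
		;

	/* Touches the predecessor: extend it instead of adding a node. */
	if (i > 0 && s->r[i - 1].end == start) {
		s->r[i - 1].end = end;

		/* The extended range may now touch the successor too. */
		if (i < s->nr && s->r[i].start == end) {
			s->r[i - 1].end = s->r[i].end;
			for (j = i; j + 1 < s->nr; j++)
				s->r[j] = s->r[j + 1];
			s->nr--;
		}
		return 0;
	}

	/* Touches only the successor: grow it downwards. */
	if (i < s->nr && s->r[i].start == end) {
		s->r[i].start = start;
		return 0;
	}

	if (s->nr == MAX_RANGES)
		return -1;

	/* No neighbour to merge with: insert at the sorted position. */
	for (j = s->nr; j > i; j--)
		s->r[j] = s->r[j - 1];
	s->r[i].start = start;
	s->r[i].end = end;
	s->nr++;
	return 0;
}
```

With this layout, adjacent frees collapse into one entry, so a later
drain walks fewer nodes, and the [min:max] range is simply
r[0].start and r[nr - 1].end without a scan.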

Apart from that, high-priority tasks such as RT or DL ones that use
vmalloc()/vfree() can start the draining process from their own
contexts, which is also a problem. In that sense, I think we need to
make the vfree() call asynchronous, so that latency-sensitive tasks and
others do not perform any draining from their own contexts.
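
What I mean by "asynchronous" can be sketched in userspace as below.
This is only an illustration under assumptions, not a proposed
implementation: free_deferred() and drain_deferred() are hypothetical
names, malloc()/free() stand in for vmalloc()/vfree(), and the sketch is
single-threaded, so a real version would need a lock-free or locked list
and a worker context that calls the drain side.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical list node. It is stored inside the buffer being freed,
 * so every buffer must be at least sizeof(struct deferred_node) bytes.
 */
struct deferred_node {
	struct deferred_node *next;
};

static struct deferred_node *deferred_head;

/* Caller side: O(1) queueing only, no draining in the caller's context. */
static void free_deferred(void *ptr)
{
	struct deferred_node *n = ptr;

	if (!ptr)
		return;

	n->next = deferred_head;
	deferred_head = n;
}

/*
 * Worker side: detach the whole list and release it.
 * Returns the number of buffers that were freed.
 */
static size_t drain_deferred(void)
{
	struct deferred_node *n = deferred_head;
	size_t freed = 0;

	deferred_head = NULL;

	while (n) {
		struct deferred_node *next = n->next;

		free(n);
		freed++;
		n = next;
	}
	return freed;
}
```

The point is the split: the latency-sensitive task pays only for the
list insertion, and the expensive part runs later from a context where
a long preemption-off section hurts less.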

--
Vlad Rezki