On Fri, May 14, 2021 at 04:50:26PM +0200, Uladzislau Rezki wrote: > > > > However, the high-order path also looks suspicious. area->nr_pages is > > > > advanced before the allocation attempt so in the event alloc_pages_node() > > > > returns NULL prematurely, area->nr_pages does not reflect the number of > > > > pages allocated so that needs examination. > > > > > > > <snip> > > > for (area->nr_pages = 0; area->nr_pages < nr_small_pages; > > > area->nr_pages += 1U << page_order) { > > > <snip> > > > > > > if alloc_pages_node() fails we break the loop. area->nr_pages is initialized > > > inside the for(...) loop, thus it will be zero if the single page allocator > > > fails on a first iteration. > > > > > > Or i miss your point? > > > > > > > At the time of the break, area->nr_pages += 1U << page_order happened > > before the allocation failure happens. That looks very suspicious. > > > The "for" loop does not work that way. If you break the loop the > "area->nr_pages += 1U << page_order" or an "increment" is not increased. > It is increased only after the body of the "for" loop executes and it > goes to next iteration. > Yeah, I don't know what I was thinking -- not properly anyway. > > > As for workloads. Most of them which are critical to time and latency. For > > > example audio/video, especially in the mobile area. I did a big rework of > > > the KVA allocator because i found it not optimal to allocation time. > > > > > > > Can you give an example benchmark that triggers it or is it somewhat > > specific to mobile platforms with drivers that use vmalloc heavily? > > > > See below an example of audio glitches. That was related to our phones > and audio workloads: > > # Explanation is here > wget ftp://vps418301.ovh.net/incoming/analysis_audio_glitches.txt > > # Audio 10 seconds sample is here. > # The drop occurs at 00:09.295 you can hear it > wget ftp://vps418301.ovh.net/incoming/tst_440_HZ_tmp_1.wav > > Apart of that a slow allocation can course two type of issues. First one > is direct. When for example a high-priority RT thread does some allocation > to bypass data to DSP. Long latency courses a delay of data to be passed to > DSP. This is drivers area. > > Another example is when a task is doing an allocation and the RT task is > placed onto a same CPU. In that case a long preemption-off(milliseconds) > section can lead the RT task for starvation. For mobile devices it is UI > stack where RT tasks are used. As a result we face frame drops. > > All such issues have been solved after a rework: > > wget ftp://vps418301.ovh.net/incoming/Reworking_of_KVA_allocator_in_Linux_kernel.pdf > Thanks. That was enough for me to search to see what sort of general workload would be affected. Mostly it's driver specific. A lot of the users that would be potentially hot are already using kvmalloc so probably not worth the effort so test_vmalloc.sh makes sense. -- Mel Gorman SUSE Labs