> On Fri, May 14, 2021 at 01:45:43PM +0200, Uladzislau Rezki wrote: > > > > Seems like "zoneref" refers to invalid address. > > > > > > > > Thoughts? > > > > > > I have not previously read the patch but there are a few concerns and it's > > > probably just as well this blew up early. The bulk allocator assumes a > > > valid node but the patch can send in NUMA_NO_NODE (-1). > > > > > > > Should the bulk-allocator handle the NUMA_NO_NODE on its own? I mean instead > > of handling by user the allocator itself fixes it if NUMA_NO_NODE is passed. > > > > No for API similarity reasons. __alloc_pages_bulk is the API bulk > equivalent to __alloc_pages() and both expect valid node IDs. vmalloc > is using alloc_pages_node for high-order pages which first checks > the node ID so your options are to check it within vmalloc.c or add a > alloc_pages_node_bulk helper that is API equivalent to alloc_pages_node > as a prerequisite to your patch. > OK. Thanks. > > > On the high-order path alloc_pages_node is used which checks nid == NUMA_NO_NODE. > > > Also, area->pages is not necessarily initialised so that could be interpreted > > > as a partially populated array so minmally you need. > > > > > > > area->pages are zeroed, because __GFP_ZERO is sued during allocating an array. > > > > Ah, yes. > > > > However, the high-order path also looks suspicious. area->nr_pages is > > > advanced before the allocation attempt so in the event alloc_pages_node() > > > returns NULL prematurely, area->nr_pages does not reflect the number of > > > pages allocated so that needs examination. > > > > > <snip> > > for (area->nr_pages = 0; area->nr_pages < nr_small_pages; > > area->nr_pages += 1U << page_order) { > > <snip> > > > > if alloc_pages_node() fails we break the loop. area->nr_pages is initialized > > inside the for(...) loop, thus it will be zero if the single page allocator > > fails on a first iteration. > > > > Or i miss your point? > > > > At the time of the break, area->nr_pages += 1U << page_order happened > before the allocation failure happens. That looks very suspicious. > The "for" loop does not work that way. If you break the loop the "area->nr_pages += 1U << page_order" or an "increment" is not increased. It is increased only after the body of the "for" loop executes and it goes to next iteration. > > > As an aside, where or what is test_vmalloc.sh? It appears to have been > > > used a few times but it's not clear it's representative so are you aware > > > of workloads that are vmalloc-intensive? It does not matter for the > > > patch as such but it would be nice to know examples of vmalloc-intensive > > > workloads because I cannot recall a time during the last few years where > > > I saw vmalloc.c high in profiles. > > > > > test_vmalloc.sh is a shell script that is used for stressing and testing a > > vmalloc subsystem as well as performance evaluation. You can find it here: > > > > ./tools/testing/selftests/vm/test_vmalloc.sh > > > > Thanks. > > > As for workloads. Most of them which are critical to time and latency. For > > example audio/video, especially in the mobile area. I did a big rework of > > the KVA allocator because i found it not optimal to allocation time. > > > > Can you give an example benchmark that triggers it or is it somewhat > specific to mobile platforms with drivers that use vmalloc heavily? > See below an example of audio glitches. That was related to our phones and audio workloads: # Explanation is here wget ftp://vps418301.ovh.net/incoming/analysis_audio_glitches.txt # Audio 10 seconds sample is here. # The drop occurs at 00:09.295 you can hear it wget ftp://vps418301.ovh.net/incoming/tst_440_HZ_tmp_1.wav Apart of that a slow allocation can course two type of issues. First one is direct. When for example a high-priority RT thread does some allocation to bypass data to DSP. Long latency courses a delay of data to be passed to DSP. This is drivers area. Another example is when a task is doing an allocation and the RT task is placed onto a same CPU. In that case a long preemption-off(milliseconds) section can lead the RT task for starvation. For mobile devices it is UI stack where RT tasks are used. As a result we face frame drops. All such issues have been solved after a rework: wget ftp://vps418301.ovh.net/incoming/Reworking_of_KVA_allocator_in_Linux_kernel.pdf -- Vlad Rezki