On 05/26/23 at 08:13am, Lorenzo Stoakes wrote: > On Fri, May 26, 2023 at 08:08:33AM +0800, Baoquan He wrote: > > On 05/24/23 at 09:24am, Lorenzo Stoakes wrote: > > > It turns out that alloc_pages_bulk_array() does not treat the page_array > > > parameter as an output parameter, but rather reads the array and skips any > > > entries that have already been allocated. > > > > I read __alloc_pages_bulk() carefully, it does store the allocated page > > pointers into page_array[] and pass out, not just reads the array and > > skips entry alreay allocated. > > Umm, the function literally opens with:- > > /* > * Skip populated array elements to determine if any pages need > * to be allocated before disabling IRQs. > */ > while (page_array && nr_populated < nr_pages && page_array[nr_populated]) > nr_populated++; OK, suppose page_array[] alreasy has three pages populated, if not initialized and there's garbage data in page_array[], it could have nr_populated > 3 finally? This is really risky. > > And then later:- > > /* Skip existing pages */ > if (page_array && page_array[nr_populated]) { > nr_populated++; > continue; > } This is interesting, I thought this place of nr_populated checking and updating is meaningless, in fact it's skipping the element with vlaue in the middle of page_array. I realize this when I recheck the code when replying to your mail. Not sure if we should restrict that, or it's really existing reasonablly. [x][x][x][][][][x][x][][] x marks the element pointing to page. > > This explicitly skips populated array entries and reads page_array to see > if entries already exist, and literally documents this in the comments > above each line, exactly as I describe. OK, I misread your words in log. While page_array[] is still output parameter, just not pure output parameter? Not sure if I understand output parameter correctly. > > > > > For the issue this patch is trying to fix, you mean __alloc_pages_bulk() > > doesn't initialize page_array intentionally if it doesn't successfully > > allocate desired number of pages. we may need add one sentence to notice > > user that page_array need be initialized explicitly. > > It isn't 'trying' to fix it, it fixes it. I have this reproing locally. Right, my wrong expression. > > What you're stating about 'successfully allocate desired number of pages' > is irrelevant, we literally check the number of allocated pages in the > caller. > > No sentences need to be added, I explicitly state that the issue is due to > the array being uninitialised, the summary lines talks about reading > garbage. Well, I meant adding sentence above __alloc_pages_bulk() to tell: page_array[] could have garbage data stored if you don't initialize it explicitly before calling __alloc_pages_bulk(); This could happen in other place if they don't use kcalloc(), kmalloc(GFP_ZERO) or something like this to allocate page_array[]? > > > > > By the way, could you please tell in which line the test was referencing > > uninitialized data and causing the PFN to not be valid and trigger the > > WANR_ON? Please forgive my dumb head. > > Well, I showed you the lines above where __alloc_bulk_array() is accessing > uninitialised data by reading page_array[]. I see now, thanks for these details. > > But ultimately this is called from vm_map_ram_test() in lib/test_vmalloc.c:- > > for (i = 0; i < test_loop_count; i++) { > v_ptr = vm_map_ram(pages, map_nr_pages, NUMA_NO_NODE); > ^--- triggers warning because we can't map the invalid PFN > *v_ptr = 'a'; > ^--- NULL pointer deref > vm_unmap_ram(v_ptr, map_nr_pages); > } > > The warning is triggered in:- > > vm_map_ram() > vmap_pages_range() > vmap_pages_range_noflush() > __vmap_pages_range_noflush() > vmap_pages_p4d_range() > vmap_pages_pud_range() > vmap_pages_pmd_range() > vmap_pages_pte_range() > > In:- > > if (WARN_ON(!pfn_valid(page_to_pfn(page)))) > return -EINVAL; > > The PFN is invalid because I happen to have garbage in an array entry such > that page_to_pfn(garbage) >= max_pfn. > > > > > > > This is somewhat unexpected and breaks this test, as we allocate the pages > > > array uninitialised on the assumption it will be overwritten. > > > > > > As a result, the test was referencing uninitialised data and causing the > > > PFN to not be valid and thus a WARN_ON() followed by a null pointer deref > > > and panic. > > > > > > In addition, this is an array of pointers not of struct page objects, so we > > > need only allocate an array with elements of pointer size. > > > > > > We solve both problems by simply using kcalloc() and referencing > > > sizeof(struct page *) rather than sizeof(struct page). > > > > > > Signed-off-by: Lorenzo Stoakes <lstoakes@xxxxxxxxx> > > > --- > > > lib/test_vmalloc.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/lib/test_vmalloc.c b/lib/test_vmalloc.c > > > index 9dd9745d365f..3718d9886407 100644 > > > --- a/lib/test_vmalloc.c > > > +++ b/lib/test_vmalloc.c > > > @@ -369,7 +369,7 @@ vm_map_ram_test(void) > > > int i; > > > > > > map_nr_pages = nr_pages > 0 ? nr_pages:1; > > > - pages = kmalloc(map_nr_pages * sizeof(struct page), GFP_KERNEL); > > > + pages = kcalloc(map_nr_pages, sizeof(struct page *), GFP_KERNEL); > > > if (!pages) > > > return -1; > > > > > > -- > > > 2.40.1 > > > > > > > A broader problem we might want to think about is how little anybody is > running this test in order that it wasn't picked up before now... obviously > there's an element of luck as to whether the page_array happens to be > zeroed or not, but you'd think it'd be garbage filled at least a reasonable > amount of the time. Hmm, that's why we may need notice people that there's risk in __alloc_pages_bulk() if page_array[] is not initialized and the garbage could be mistaken as a effective page pointer. My personal opinion. People may argue it's caller's responsibility to do that. Thanks Baoquan