On 27.06.22 18:09, Charan Teja Kalla wrote: > The below race between page_ext and online/offline of the respective > memory blocks will cause use-after-free on the access of page_ext structure. > > process1 process2 > --------- --------- > a)doing /proc/page_owner doing memory offline > through offline_pages > > b)PageBuddy check is failed > thus proceed to get the > page_owner information > through page_ext access. > page_ext = lookup_page_ext(page); > > migrate_pages(); > ................ > Since all pages are successfully > migrated as part of the offline > operation,send MEM_OFFLINE notification > where for page_ext it calls: > offline_page_ext()--> > __free_page_ext()--> > free_page_ext()--> > vfree(ms->page_ext) > mem_section->page_ext = NULL > > c) Check for the PAGE_EXT flags > in the page_ext->flags access > results into the use-after-free(leading > to the translation faults). > > As mentioned above, there is really no synchronization between page_ext > access and its freeing in the memory_offline. The above is just one > example but the problem persists in the other paths too involving > page_ext->flags access(eg: page_is_idle()). > > The memory offline steps(roughly) on a memory block is as below: > 1) Isolate all the pages > 2) while(1) > try free the pages to buddy.(->free_list[MIGRATE_ISOLATE]) > 3) delete the pages from this buddy list. > 4) Then free page_ext.(Note: The struct page is still alive as it is > freed only during hot remove of the memory which frees the memmap, which > steps the user might not perform). > > This design leads to the state where struct page is alive but the struct > page_ext is freed, where the later is ideally part of the former which > just representing the page_flags. This seems to be a wrong design where > 'struct page' as a whole is not accessible(Thanks to Minchan for > pointing this out). Accessing the struct page -- including any extensions -- is invalid if the memory section is marked offline. Usual PFN walkers use pfn_to_online_page() to make sure we have PFN with an actual meaning in it. There is no real synchronization between pfn_to_online_page() and memory offline code. For now it wasn't required because it was never relevant in practice. After pfn_to_online_page() it takes quite a long time until memory is actually offlined and then, the memmap is removed. Maybe it's different for page_ext. It smells like page_ext should use some mechanism during MEM_OFFLINE to synchronize against any users of its metadata. Generic memory offlining code might be the wrong place for that. > > Some solutions we think are: > ---------------------------- > 1) Take the mem_hotplug_lock read_lock every time page_ext access. That would be the big hammer. But it feels wrong, because page_ext is another subsystem that's synchronized from generic memory offlining code via the notifier. > > 2) Take the extra refcount on the page every time page_ext access is > made, so that parallel offline operation can't free the page to buddy. No, that's no good. Just racy. > > 3) Change the design where the page_ext is valid as long as the struct > page is alive. :/ Doesn't spark joy. > > Any other inputs here? page_ext needs a mechanism to synchronize against any users of the data it manages. Maybe RCU can help? -- Thanks, David / dhildenb