Hi Charan, On Thu, Jul 14, 2022 at 08:17:43PM +0530, Charan Teja Kalla wrote: > The below is one path where race between page_ext and offline of the > respective memory blocks will cause use-after-free on the access of > page_ext structure. > > process1 process2 > --------- --------- > a)doing /proc/page_owner doing memory offline > through offline_pages. > > b)PageBuddy check is failed > thus proceed to get the > page_owner information > through page_ext access. > page_ext = lookup_page_ext(page); > > migrate_pages(); > ................. > Since all pages are successfully > migrated as part of the offline > operation,send MEM_OFFLINE notification > where for page_ext it calls: > offline_page_ext()--> > __free_page_ext()--> > free_page_ext()--> > vfree(ms->page_ext) > mem_section->page_ext = NULL > > c) Check for the PAGE_EXT flags > in the page_ext->flags access > results into the use-after-free(leading > to the translation faults). > > As mentioned above, there is really no synchronization between page_ext > access and its freeing in the memory_offline. > > The memory offline steps(roughly) on a memory block is as below: > 1) Isolate all the pages > 2) while(1) > try free the pages to buddy.(->free_list[MIGRATE_ISOLATE]) > 3) delete the pages from this buddy list. > 4) Then free page_ext.(Note: The struct page is still alive as it is > freed only during hot remove of the memory which frees the memmap, which > steps the user might not perform). > > This design leads to the state where struct page is alive but the struct > page_ext is freed, where the later is ideally part of the former which > just representing the page_flags. This seems to be a wrong design where > 'struct page' as a whole is not accessible(Thanks to Minchan for > pointing this out). > > The above mentioned race is just one example __but the problem persists > in the other paths too involving page_ext->flags access(eg: > page_is_idle())__. Since offline waits till the last reference on the > page goes down i.e. any path that took the refcount on the page can make > the memory offline operation to wait. Eg: In the migrate_pages() > operation, we do take the extra refcount on the pages that are under > migration and then we do copy page_owner by accessing page_ext. For > > Fix those paths where offline races with page_ext access by maintaining > synchronization with rcu lock. > > Thanks to David Hildenbrand for his views/suggestions on the initial > discussion[1]. > > [1] https://lore.kernel.org/linux-mm/59edde13-4167-8550-86f0-11fc67882107@xxxxxxxxxxx/ > > Signed-off-by: Charan Teja Kalla <quic_charante@xxxxxxxxxxx> > --- > include/linux/page_ext.h | 19 +++++++++++++++++++ > include/linux/page_idle.h | 40 ++++++++++++++++++++++++++++++---------- > mm/page_ext.c | 3 ++- > mm/page_owner.c | 18 +++++++++++++++--- > mm/page_table_check.c | 10 +++++++--- > 5 files changed, 73 insertions(+), 17 deletions(-) > > diff --git a/include/linux/page_ext.h b/include/linux/page_ext.h > index fabb2e1..df5d353 100644 > --- a/include/linux/page_ext.h > +++ b/include/linux/page_ext.h > @@ -64,6 +64,25 @@ static inline struct page_ext *page_ext_next(struct page_ext *curr) > return next; > } > > +static inline struct page_ext *get_page_ext(struct page *page) > +{ > + struct page_ext *page_ext; > + > + rcu_read_lock(); > + page_ext = lookup_page_ext(page); > + if (!page_ext) { > + rcu_read_unlock(); > + return NULL; > + } > + > + return page_ext; > +} > + > +static inline void put_page_ext(void) > +{ > + rcu_read_unlock(); > +} > + Would it be a harm if we make lookup_page_ext() completely a private function? Or is there any codepath that have the benefit of calling lookup_page_ext() without going through get_page_ext()? If that is the case, we should add RCU lockdep check inside lookup_page_ext() to make sure that this function is called with RCUs. Thanks, Pavan