On Fri 29-07-22 21:17:44, Charan Teja Kalla wrote:
> Thanks Michal for the reviews!!
>
> On 7/28/2022 8:07 PM, Michal Hocko wrote:
> >> FAQ's:
> >> Q) Should page_ext_[get|put]() needs to be used for every page_ext
> >> access?
> >> A) NO, the synchronization is really not needed in all the paths of
> >> accessing page_ext. One case is where extra refcount is taken on a
> >> page for which memory block, this pages falls into, offline operation is
> >> being performed. This extra refcount makes the offline operation not to
> >> succeed hence the freeing of page_ext. Another case is where the page
> >> is already being freed and we do reset its page_owner.
> >
> > This is just subtlety and something that can get misunderstood over
> > time. Moreover there is no documentation explaining the difference.
> > What is the reason to have these two different APIs in the first place.
> > RCU read side is almost zero cost. So what is the point?
>
> Currently not all the places where page_ext is being used is put under
> the rcu_lock. I just used rcu lock in the places where it is possible to
> have the use-after-free of page_ext. You recommend to use rcu lock while
> using with page_ext in all the places?

Yes. Using locking inconsistently just begs for future problems. There
should be a very good reason to use a lockless approach in some paths,
and that would be where the locking overhead is not really acceptable or
where the locking cannot be used for other reasons. The RCU read lock is
essentially zero overhead, so the only reason would be that the critical
section needs to sleep. Is any of that the case?

If there is a real need to have a lockless variant then I would propose
to add __page_ext_get/put, which would be lockless, clearly document
the contexts in which it can be used, and enforce those conditions
(e.g. the reference count assumption).
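To make the proposal concrete, here is a rough userspace sketch (not
kernel code) of how the two variants could be split. Everything in it
is an assumption for illustration only: the RCU read lock is mocked by
a nesting counter, struct mock_page and struct mock_section are made up,
PAGE_EXT_INVALID is assumed to be a low-bit tag, and the refcount check
merely stands in for whatever condition the lockless variant would
really have to enforce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock of the RCU read side: only tracks nesting depth, no real RCU. */
static int rcu_depth;
static void mock_rcu_read_lock(void)   { rcu_depth++; }
static void mock_rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

/* Assumed value: a low-bit tag marking a page_ext that is going away. */
#define PAGE_EXT_INVALID 0x1UL

struct page_ext { unsigned long flags; };

/* Hypothetical stand-ins for the memory section and the page. */
struct mock_section { struct page_ext *page_ext; };
struct mock_page { int refcount; struct mock_section *ms; };

/* NULL or tagged pointers must not be handed out to users. */
static int page_ext_invalid(const struct page_ext *ext)
{
	return !ext || ((uintptr_t)ext & PAGE_EXT_INVALID);
}

/* Default variant: always enters the (mock) RCU read side. */
static struct page_ext *page_ext_get(struct mock_page *page)
{
	struct page_ext *ext;

	mock_rcu_read_lock();
	ext = page->ms->page_ext;	/* kernel code would fetch this once */
	if (page_ext_invalid(ext)) {
		mock_rcu_read_unlock();
		return NULL;
	}
	return ext;
}

/* Pairs with page_ext_get(); a NULL result never took the lock. */
static void page_ext_put(struct page_ext *ext)
{
	if (!ext)
		return;
	mock_rcu_read_unlock();
}

/*
 * Lockless variant. Documented precondition: the caller holds a
 * reference on the page, which keeps the offline operation from
 * succeeding. The precondition is checked rather than trusted.
 */
static struct page_ext *__page_ext_get(struct mock_page *page)
{
	struct page_ext *ext;

	if (page->refcount <= 0)	/* kernel code might WARN here */
		return NULL;
	ext = page->ms->page_ext;
	return page_ext_invalid(ext) ? NULL : ext;
}
```

The point of the split is that the locked pair is what everybody uses
by default, while the lockless variant states its assumption in one
place and refuses to proceed when that assumption does not hold.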
> My only point here is since there may be a non-atomic context exist
> across page_ext_get/put() and If users are sure that this page's
> page_ext will not be freed by parallel offline operation, they need not
> get the rcu lock.

Existing users are probably easy to check, but think about the future.
Most developers (even a large part of the MM community) are not deeply
familiar with memory hotplug. Not to mention that people do not tend
to follow development in that area, and assumptions might change.

[...]
> >> @@ -298,9 +300,26 @@ static void __free_page_ext(unsigned long pfn)
> >>  	ms = __pfn_to_section(pfn);
> >>  	if (!ms || !ms->page_ext)
> >>  		return;
> >> -	base = get_entry(ms->page_ext, pfn);
> >> +
> >> +	base = READ_ONCE(ms->page_ext);
> >> +	if (page_ext_invalid(base))
> >> +		base = (void *)base - PAGE_EXT_INVALID;
> >
> > All page_ext accesses should use the same fetched pointer including the
> > ms->page_ext check. Also page_ext_invalid _must_ be true here otherwise
> > something bad is going on so I would go with
> > 	if (WARN_ON_ONCE(!page_ext_invalid(base)))
> > 		return;
> > 	base = (void *)base - PAGE_EXT_INVALID;
>
> The roll back operation in the online_page_ext(), where we free the
> allocated page_ext's, will not have the PAGE_EXT_INVALID flag thus
> WARN() may not work here. no?

Wouldn't ms->page_ext be NULL in that case?
-- 
Michal Hocko
SUSE Labs