On Thu 06-05-21 15:28:05, Aili Yao wrote: > On Thu, 6 May 2021 09:06:14 +0200 > Michal Hocko <mhocko@xxxxxxxx> wrote: > > > On Thu 06-05-21 08:56:11, Aili Yao wrote: > > > On Wed, 5 May 2021 15:27:39 +0200 > > > Michal Hocko <mhocko@xxxxxxxx> wrote: [...] > > > > I am not sure I follow. My point is that I fail to see any added value > > > > of the check as it doesn't prevent the race (it fundamentally cannot as > > > > the page can be poisoned at any time) but the failure path doesn't > > > > put_page which is incorrect even for hwpoison pages. > > > > > > Sorry, I have something to say: > > > > > > I have noticed the ref count leak in the previous topic ,but I don't think > > > it's a really matter. For memory recovery case for user pages, we will keep one > > > reference to the poison page so the error page will not be freed to buddy allocator. > > > which can be checked in memory_faulure() function. > > > > So what would happen if those pages are hwpoisoned from userspace rather > > than by HW. And repeatedly so? > > Sorry, I may be not totally understand what you mean. > > Do you mean hard page offline from mcelog? No I mean soft hwpoison from userspace (e.g. by MADV_HWPOISON but there are other interfaces AFAIK). And just to be explicit. All those interfaces are root only (CAP_SYS_ADMIN) so I am not really worried about any malitious abuse of the reference leak. I am mostly concerned that this is obviously broken without a good reason. The most trivial fix would have been to put_page in the return path but as I've mentioned in other email thread the fix really needs a deeper thought and consider other things. Hope that clarifies this some more. -- Michal Hocko SUSE Labs