On Thu 06-05-21 13:47:50, Aili Yao wrote:
> On Wed, 5 May 2021 15:54:07 +0200
> Michal Hocko <mhocko@xxxxxxxxxx> wrote:
>
> > From: Michal Hocko <mhocko@xxxxxxxx>
> >
> > While reviewing http://lkml.kernel.org/r/20210429122519.15183-4-david@xxxxxxxxxx
> > I have crossed d3378e86d182 ("mm/gup: check page posion status for
> > coredump.") and noticed that this patch is broken in two ways. First it
> > doesn't really prevent hwpoison pages from being dumped because hwpoison
> > pages can be marked asynchronously at any time after the check.
>
> I rethink this:
> There are two cases for this coredump panic issue.
> One is the scenario that the hwpoison flag is set correctly, and the previous patch
> will make it recoverable and avoid panic.
>
> Another is the hwpoison flag not valid in the check, maybe race condition. I don't think
> this case is worth and realizable to be covered. As the SRAR can happen freshly in the dump
> process and thus can't be detected.
>
> And the previous patch doesn't make the Another case worse and unacceptable. just as it can't be
> covered.
>
> So here is the patch:
> For most case in this topic, the patch will work. For the case hwpoison flag not valid, it will
> fallback to the original process before this patch --- just panic.

Please propose a new fix which
a) doesn't leak a page reference
b) evaluates how realistic the scenario is
c) explains why any other gup user doesn't really need to care - or in
   other words is the gup layer really suitable for this issue?
-- 
Michal Hocko
SUSE Labs