On Tue, Sep 07, 2021 at 08:14:26PM -0700, Yang Shi wrote: ... > > > > > > > 5. We also could define a new FGP flag to return poisoned page, NULL > > > > > > > or error pointer. This also should need significant code change since > > > > > > > a lt callsites need to be contemplated. > > > > > > > > > > > > Could you explain a little more about which callers should use the flag? > > > > > > > > > > Just to solve the above invalidate/truncate problem and page fault > > > > > doesn't expect an error pointer. But it seems the above > > > > > invalidate/truncate paths don't matter. Page fault should be the only > > > > > user since page fault may need unlock the page if poisoned page is > > > > > returned. > > > > Orignally PG_hwpoison is detected in page fault handler for file-backed pages > > like below, > > > > static vm_fault_t __do_fault(struct vm_fault *vmf) > > ... > > ret = vma->vm_ops->fault(vmf); > > if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY | > > VM_FAULT_DONE_COW))) > > return ret; > > > > if (unlikely(PageHWPoison(vmf->page))) { > > if (ret & VM_FAULT_LOCKED) > > unlock_page(vmf->page); > > put_page(vmf->page); > > vmf->page = NULL; > > return VM_FAULT_HWPOISON; > > } > > > > so it seems to me that if a filesystem switches to the new scheme of keeping > > error pages in page cache, ->fault() callback for the filesystem needs to be > > aware of hwpoisoned pages inside it and returns VM_FAULT_HWPOISON when it > > detects an error page (maybe by detecting pagecache_get_page() to return > > PTR_ERR(-EIO or -EHWPOISON) or some other ways). In the other filesystems > > with the old scheme, fault() callback does not return VM_FAULT_HWPOISON so > > that page fault handler falls back to the generic PageHWPoison check above. > > Yes, exactly. If we return ERR_PTR we need modify the above code > accordingly. But returning the page, no change is required. > > > > > > > > > > > It seems page fault check IS_ERR(page) then just return > > > > VM_FAULT_HWPOISON. But I found a couple of places in shmem which want > > > > to return head page then handle subpage or just return the page but > > > > don't care the content of the page. They should ignore hwpoison. So I > > > > guess we'd better to have a FGP flag for such cases. > > > > At least currently thp are supposed to be split before error handling. > > Not for file THP. > > > We could loosen this assumption to support hwpoison on a subpage of thp, > > but I'm not sure whether that has some benefit. > > I don't quite get your point. Currently the hwpoison flag is set on > specific subpage instead of head page. Sorry, I miss the case when thp split fails, then the thp has hwpoison flag set on any subpage. I only thought of the successful case, where the resulting error page is no longer a subpage. > > > And also this discussion makes me aware that we need to have a way to > > prevent hwpoisoned pages from being used to thp collapse operation. > > Actually not. The refcount from hwpoison could prevent it from > collapsing. But if we return ERR_PTR (#3) we need extra code in > khugepaged to handle it. OK, so we already prevent it. Thanks, Naoya Horiguchi > > So I realized #1 would require fewer changes. And leaving the page > state check to callers seem reasonable since the callers usually check > other states, e.g. uptodate.