On Wed, May 17, 2023 at 4:30 PM Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote: > > On 05/17/23 16:09, Jiaqi Yan wrote: > > Today when hardware memory is corrupted in a hugetlb hugepage, > > kernel leaves the hugepage in pagecache [1]; otherwise future mmap or > > read will suject to silent data corruption. This is implemented by > > returning -EIO from hugetlb_read_iter immediately if the hugepage has > > HWPOISON flag set. > > > > Since memory_failure already tracks the raw HWPOISON subpages in a > > hugepage, a natural improvement is possible: if userspace only asks for > > healthy subpages in the pagecache, kernel can return these data. > > Thanks for putting this together. > > I recall discussing this some time back, and deciding to wait and see > how HGM would progress. Since it may be some time before HGM goes > upstream, it would be reasonable to consider this again. This improvement actually does NOT depend on HGM at all. No page table related stuff involved here. The other RFC [2] I sent earlier DOES require HGM. This improvement was brought up by James when we were working on [2]. In "Future Work" section of the cover letter, I thought HGM was needed but soon when I code it up, I found I was wrong. > > One quick question. > Do you have an actual use case for this? It certainly is an improvement > over existing functionality. However, I am not aware of too many (?any?) > users actually doing read() calls on hugetlb files. I don't have any use case. I did search on Github for around half a hour and all the hugetlb usages are done via mmap. > -- > Mike Kravetz > > > This patchset implements this improvement. It consist of three parts. > > The 1st commit exports the functionality to tell if a subpage inside a > > hugetlb hugepage is a raw HWPOISON page. The 2nd commit teaches > > hugetlbfs_read_iter to return as many healthy bytes as possible. > > The 3rd commit properly tests this new feature. > > > > [1] commit 8625147cafaa ("hugetlbfs: don't delete error page from pagecache") [2] https://lore.kernel.org/linux-mm/20230428004139.2899856-6-jiaqiyan@xxxxxxxxxx/T/#m97c6edef8ad0cc9b064e1fd9369b8521dcfa43de > > > > Jiaqi Yan (3): > > mm/hwpoison: find subpage in hugetlb HWPOISON list > > hugetlbfs: improve read HWPOISON hugepage > > selftests/mm: add tests for HWPOISON hugetlbfs read > > > > fs/hugetlbfs/inode.c | 62 +++- > > include/linux/mm.h | 23 ++ > > mm/memory-failure.c | 26 +- > > tools/testing/selftests/mm/.gitignore | 1 + > > tools/testing/selftests/mm/Makefile | 1 + > > .../selftests/mm/hugetlb-read-hwpoison.c | 322 ++++++++++++++++++ > > 6 files changed, 419 insertions(+), 16 deletions(-) > > create mode 100644 tools/testing/selftests/mm/hugetlb-read-hwpoison.c > > > > -- > > 2.40.1.606.ga4b1b128d6-goog > > (Sorry if you received twice, was sent in a wrong way a while ago)