On Wed, Oct 16, 2024 at 8:25 PM Baolin Wang
<baolin.wang@xxxxxxxxxxxxxxxxx> wrote:
>
> On 2024/10/17 01:33, Yang Shi wrote:
> > On Wed, Oct 16, 2024 at 8:38 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >>
> >> On Wed, Oct 16, 2024 at 06:09:30PM +0800, Baolin Wang wrote:
> >>> @@ -3128,8 +3127,9 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
> >>>                  if (folio) {
> >>>                          folio_unlock(folio);
> >>>
> >>> -                        page = folio_file_page(folio, index);
> >>> -                        if (PageHWPoison(page)) {
> >>> +                        if (folio_test_hwpoison(folio) ||
> >>> +                            (folio_test_large(folio) &&
> >>> +                             folio_test_has_hwpoisoned(folio))) {
> >>
> >> Hm, so if we have hwpoison set on one page in a folio, we now can't read
> >> bytes from any page in the folio? That seems like we've made a bad
> >> situation worse.
> >
> > Yeah, I agree. I think we can fall back to page copy if
> > folio_test_has_hwpoisoned is true. The PG_hwpoison flag is per page.
> >
> > The folio_test_has_hwpoisoned flag is kept set if the folio split
> > fails in the memory failure handler.
>
> Right. I can still keep the page size copy if
> folio_test_has_hwpoisoned() is true. Some sample changes are as follows.
>
> Moreover, I noticed shmem splice_read() and write() also simply return
> an error if folio_test_has_hwpoisoned() is true, without any fallback
> to page granularity. I wonder if it is worth adding page granularity
> support there as well?

I think you should do the same.

> diff --git a/mm/shmem.c b/mm/shmem.c
> index 7e79b6a96da0..f30e24e529b9 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3111,9 +3111,11 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>
>          for (;;) {
>                  struct folio *folio = NULL;
> +                struct page *page = NULL;
>                  unsigned long nr, ret;
>                  loff_t end_offset, i_size = i_size_read(inode);
>                  size_t fsize;
> +                bool fallback_page_copy = false;
>
>                  if (unlikely(iocb->ki_pos >= i_size))
>                          break;
> @@ -3127,13 +3129,16 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>                  if (folio) {
>                          folio_unlock(folio);
>
> -                        if (folio_test_hwpoison(folio) ||
> -                            (folio_test_large(folio) &&
> -                             folio_test_has_hwpoisoned(folio))) {
> +                        page = folio_file_page(folio, index);
> +                        if (PageHWPoison(page)) {
>                                  folio_put(folio);
>                                  error = -EIO;
>                                  break;
>                          }
> +
> +                        if (folio_test_large(folio) &&
> +                            folio_test_has_hwpoisoned(folio))
> +                                fallback_page_copy = true;
>                  }
>
>                  /*
> @@ -3147,7 +3152,7 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>                          break;
>                  }
>                  end_offset = min_t(loff_t, i_size, iocb->ki_pos + to->count);
> -                if (folio)
> +                if (folio && likely(!fallback_page_copy))
>                          fsize = folio_size(folio);
>                  else
>                          fsize = PAGE_SIZE;
> @@ -3160,8 +3165,13 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>                           * virtual addresses, take care about potential aliasing
>                           * before reading the page on the kernel side.
>                           */
> -                        if (mapping_writably_mapped(mapping))
> -                                flush_dcache_folio(folio);
> +                        if (mapping_writably_mapped(mapping)) {
> +                                if (unlikely(fallback_page_copy))
> +                                        flush_dcache_page(page);
> +                                else
> +                                        flush_dcache_folio(folio);
> +                        }
> +
>                          /*
>                           * Mark the page accessed if we read the beginning.
>                           */
> @@ -3171,7 +3181,10 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>                           * Ok, we have the page, and it's up-to-date, so
>                           * now we can copy it to user space...
>                           */
> -                        ret = copy_folio_to_iter(folio, offset, nr, to);
> +                        if (unlikely(fallback_page_copy))
> +                                ret = copy_page_to_iter(page, offset, nr, to);
> +                        else
> +                                ret = copy_folio_to_iter(folio, offset, nr, to);
>                          folio_put(folio);
>                  } else if (user_backed_iter(to)) {
>                          /*

The change seems fine to me.
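For the splice path, a completely untested sketch of the same idea in
shmem_file_splice_read() might look like the below (fallback_page_splice
is just an illustrative name, and it assumes the existing part/isize/len
locals of that function): keep the per-page PageHWPoison() check, and on
fallback clamp each splice so it never crosses a page boundary, so the
next iteration re-checks the next subpage instead of failing the whole
folio:

        struct page *page = NULL;
        bool fallback_page_splice = false;
        ...
        if (folio) {
                folio_unlock(folio);

                /* Only fail if the page actually being spliced is poisoned. */
                page = folio_file_page(folio, *ppos / PAGE_SIZE);
                if (PageHWPoison(page)) {
                        error = -EIO;
                        break;
                }

                if (folio_test_large(folio) &&
                    folio_test_has_hwpoisoned(folio))
                        fallback_page_splice = true;
        }
        ...
        part = min_t(loff_t, isize - *ppos, len);
        /*
         * On fallback, splice at most one page at a time so that each
         * subpage gets its own PageHWPoison() check above.
         */
        if (unlikely(fallback_page_splice))
                part = min_t(loff_t, part,
                             PAGE_SIZE - offset_in_page(*ppos));

The flush_dcache_folio() call in that function would presumably need the
same flush_dcache_page() treatment as in the read path above.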