a.k.a "Fix rare user data corruption when using THP" :) On Mon, 14 Aug 2023, Matthew Wilcox (Oracle) wrote: > The special casing was originally added in pre-git history; reproducing > the commit log here: > > > commit a318a92567d77 > > Author: Andrew Morton <akpm@xxxxxxxx> > > Date: Sun Sep 21 01:42:22 2003 -0700 > > > > [PATCH] Speed up direct-io hugetlbpage handling > > > > This patch short-circuits all the direct-io page dirtying logic for > > higher-order pages. Without this, we pointlessly bounce BIOs up to > > keventd all the time. > > In the last twenty years, compound pages have become used for more than > just hugetlb. Rewrite these functions to operate on folios instead > of pages and remove the special case for hugetlbfs; I don't think > it's needed any more (and if it is, we can put it back in as a call > to folio_test_hugetlb()). > > This was found by inspection; as far as I can tell, this bug can lead > to pages used as the destination of a direct I/O read not being marked > as dirty. If those pages are then reclaimed by the MM without being > dirtied for some other reason, they won't be written out. Then when > they're faulted back in, they will not contain the data they should. > It'll take a pretty unusual setup to produce this problem with several > races all going the wrong way. > > This problem predates the folio work; it could for example have been > triggered by mmaping a THP in tmpfs and using that as the target of an > O_DIRECT read. > > Fixes: 800d8c63b2e98 ("shmem: add huge pages support") No. It's a good catch, but bug looks specific to the folio work to me. Almost all shmem pages are dirty from birth, even as soon as they are brought back from swap; so it is not necessary to re-mark them dirty. The exceptions are pages allocated to holes when faulted: so you did get me worried as to whether khugepaged could collapse a pmd-ful of those into a THP without marking the result as dirty. But no, in v6.5-rc6 the collapse_file() success path has if (is_shmem) folio_mark_dirty(folio); and in v5.10 the same appears as if (is_shmem) set_page_dirty(new_page); (IIRC, that or marking pmd dirty was missed from early shmem THP support, but fairly soon corrected, and backported to stable then. I have a faint memory of versions which assembled pmd_dirty from collected pte_dirtys.) And the !is_shmem case is for CONFIG_READ_ONLY_THP_FOR_FS: writing into those pages, by direct IO or whatever, is already prohibited. It's dem dirty (or not dirty) folios dat's the trouble! Hugh > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx> > --- > block/bio.c | 46 ++++++++++++++++++++++++---------------------- > 1 file changed, 24 insertions(+), 22 deletions(-)