On Tue, 2018-10-23 at 10:30 -0700, Mike Kravetz wrote:
> ..... snip....
> Here is updated patch without the drop_caches change and updated
> fixes tag.
>
> From: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
>
> hugetlbfs: dirty pages as they are added to pagecache
>
> Some test systems were experiencing negative huge page reserve
> counts and incorrect file block counts.  This was traced to
> /proc/sys/vm/drop_caches removing clean pages from hugetlbfs
> file pagecaches.  When non-hugetlbfs explicit code removes the
> pages, the appropriate accounting is not performed.
>
> This can be recreated as follows:
>  fallocate -l 2M /dev/hugepages/foo
>  echo 1 > /proc/sys/vm/drop_caches
>  fallocate -l 2M /dev/hugepages/foo
>  grep -i huge /proc/meminfo
>    AnonHugePages:         0 kB
>    ShmemHugePages:        0 kB
>    HugePages_Total:    2048
>    HugePages_Free:     2047
>    HugePages_Rsvd:    18446744073709551615
>    HugePages_Surp:        0
>    Hugepagesize:       2048 kB
>    Hugetlb:         4194304 kB
>  ls -lsh /dev/hugepages/foo
>    4.0M -rw-r--r--. 1 root root 2.0M Oct 17 20:05 /dev/hugepages/foo
>
> To address this issue, dirty pages as they are added to pagecache.
> This can easily be reproduced with fallocate as shown above.  Read
> faulted pages will eventually end up being marked dirty.  But there
> is a window where they are clean and could be impacted by code such
> as drop_caches.  So, just dirty them all as they are added to the
> pagecache.
>
> Fixes: 6bda666a03f0 ("hugepages: fold find_or_alloc_pages into huge_no_page()")
> Cc: stable@xxxxxxxxxxxxxxx
> Signed-off-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> ---
>  mm/hugetlb.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 5c390f5a5207..7b5c0ad9a6bd 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3690,6 +3690,12 @@ int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
>  		return err;
>  	ClearPagePrivate(page);
>
> +	/*
> +	 * set page dirty so that it will not be removed from cache/file
> +	 * by non-hugetlbfs specific code paths.
> +	 */
> +	set_page_dirty(page);
> +
>  	spin_lock(&inode->i_lock);
>  	inode->i_blocks += blocks_per_huge_page(h);
>  	spin_unlock(&inode->i_lock);

This looks good.

Reviewed-by: Khalid Aziz <khalid.aziz@xxxxxxxxxx>

--
Khalid
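
For reference, a sketch of how huge_add_to_page_cache() reads with the
hunk above applied.  Lines outside the hunk are reconstructed from the
diff context and generic hugetlb code, so treat them as illustrative
rather than an exact copy of the tree:

int huge_add_to_page_cache(struct page *page, struct address_space *mapping,
			   pgoff_t idx)
{
	struct inode *inode = mapping->host;
	struct hstate *h = hstate_inode(inode);
	int err = add_to_page_cache(page, mapping, idx, GFP_KERNEL);

	if (err)
		return err;
	ClearPagePrivate(page);

	/*
	 * Set page dirty so that it will not be removed from cache/file
	 * by non-hugetlbfs specific code paths such as drop_caches.
	 */
	set_page_dirty(page);

	spin_lock(&inode->i_lock);
	inode->i_blocks += blocks_per_huge_page(h);
	spin_unlock(&inode->i_lock);
	return 0;
}

Marking the page dirty before the i_blocks accounting closes the window
in which a clean hugetlbfs page sits in the pagecache and can be tossed
by generic reclaim/invalidation paths without the hugetlbfs-specific
bookkeeping being undone.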