On Thu, Aug 25, 2022 at 04:32:58PM +0200, David Hildenbrand wrote:
> commit f96f7a40874d7c746680c0b9f57cef2262ae551f upstream.
>
> Patch series "mm/hugetlb: fix write-fault handling for shared mappings", v2.
>
> I observed that hugetlb does not support/expect write-faults in shared
> mappings that would have to map the R/O-mapped page writable -- and I
> found two cases where we could currently get such faults and would
> erroneously map an anon page into a shared mapping.
>
> Reproducers are part of the patches.
>
> I propose to backport both fixes to stable trees.  The first fix needs a
> small adjustment.
>
> This patch (of 2):
>
> Staring at hugetlb_wp(), one might wonder where all the logic for shared
> mappings is when stumbling over a write-protected page in a shared
> mapping.  In fact, there is none, and so far we thought we could get away
> with that because e.g., mprotect() should always do the right thing and
> map all pages directly writable.
>
> Looks like we were wrong:
>
> --------------------------------------------------------------------------
>  #include <stdio.h>
>  #include <stdlib.h>
>  #include <string.h>
>  #include <fcntl.h>
>  #include <unistd.h>
>  #include <errno.h>
>  #include <sys/mman.h>
>
>  #define HUGETLB_SIZE (2 * 1024 * 1024u)
>
>  static void clear_softdirty(void)
>  {
>  	int fd = open("/proc/self/clear_refs", O_WRONLY);
>  	const char *ctrl = "4";
>  	int ret;
>
>  	if (fd < 0) {
>  		fprintf(stderr, "open(clear_refs) failed\n");
>  		exit(1);
>  	}
>  	ret = write(fd, ctrl, strlen(ctrl));
>  	if (ret != strlen(ctrl)) {
>  		fprintf(stderr, "write(clear_refs) failed\n");
>  		exit(1);
>  	}
>  	close(fd);
>  }
>
>  int main(int argc, char **argv)
>  {
>  	char *map;
>  	int fd;
>
>  	fd = open("/dev/hugepages/tmp", O_RDWR | O_CREAT, 0600);
>  	if (fd < 0) {
>  		fprintf(stderr, "open() failed\n");
>  		return -errno;
>  	}
>  	if (ftruncate(fd, HUGETLB_SIZE)) {
>  		fprintf(stderr, "ftruncate() failed\n");
>  		return -errno;
>  	}
>
>  	map = mmap(NULL, HUGETLB_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
>  	if (map == MAP_FAILED) {
>  		fprintf(stderr, "mmap() failed\n");
>  		return -errno;
>  	}
>
>  	*map = 0;
>
>  	if (mprotect(map, HUGETLB_SIZE, PROT_READ)) {
>  		fprintf(stderr, "mprotect() failed\n");
>  		return -errno;
>  	}
>
>  	clear_softdirty();
>
>  	if (mprotect(map, HUGETLB_SIZE, PROT_READ|PROT_WRITE)) {
>  		fprintf(stderr, "mprotect() failed\n");
>  		return -errno;
>  	}
>
>  	*map = 0;
>
>  	return 0;
>  }
> --------------------------------------------------------------------------
>
> The above test fails with SIGBUS when there is only a single free hugetlb page.
>  # echo 1 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>  # ./test
>  Bus error (core dumped)
>
> And worse, with sufficient free hugetlb pages it will map an anonymous page
> into a shared mapping, for example, messing up accounting during unmap
> and breaking MAP_SHARED semantics:
>  # echo 2 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>  # ./test
>  # cat /proc/meminfo | grep HugePages_
>  HugePages_Total:       2
>  HugePages_Free:        1
>  HugePages_Rsvd:    18446744073709551615
>  HugePages_Surp:        0
>
> The reason in this particular case is that vma_wants_writenotify() will
> return "true", removing VM_SHARED in vma_set_page_prot() to map pages
> write-protected.  Let's teach vma_wants_writenotify() that hugetlb does not
> support softdirty tracking.
>
> Link: https://lkml.kernel.org/r/20220811103435.188481-1-david@xxxxxxxxxx
> Link: https://lkml.kernel.org/r/20220811103435.188481-2-david@xxxxxxxxxx
> Fixes: 64e455079e1b ("mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared")
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
> Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
> Cc: Peter Feiner <pfeiner@xxxxxxxxxx>
> Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> Cc: Cyrill Gorcunov <gorcunov@xxxxxxxxxx>
> Cc: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
> Cc: Jamie Liu <jamieliu@xxxxxxxxxx>
> Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> Cc: Naoya Horiguchi <n-horiguchi@xxxxxxxxxxxxx>
> Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> Cc: Muchun Song <songmuchun@xxxxxxxxxxxxx>
> Cc: Peter Xu <peterx@xxxxxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx>	[3.18+]
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
> ---
>  mm/mmap.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)

Now queued up, thanks.

greg k-h