On Fri, Jun 04, 2021 at 12:43:22AM -0700, Ming Lin wrote: > Adds new flag MAP_NOSIGBUS of mmap() to specify the behavior of > "don't SIGBUS on fault". Right now, this flag is only allowed > for private mapping. > > For MAP_NOSIGBUS mapping, map in the zero page on read fault > or fill a freshly allocated page with zeroes on write fault. I am wondering if this could be of limited use for me if MAP_NOSIGBUS were to be supported for shared mappings as well. When virtiofs is run with dax enabled, then it is possible that if a file is shared between two guests, then one guest truncates the file and second guest tries to do load/store operation. Given current kvm architecture, there is no mechanism to propagate SIGBUS to guest process, instead KVM retries page fault infinitely and guest cpu/process hangs. Ideally we want this error to propagate all the way back into the guest and to the guest process but that solution is not in place yet. https://lore.kernel.org/kvm/20200406190951.GA19259@xxxxxxxxxx/ In the absense of a proper solution, one could think of mapping shared file on host with MAP_NOSIGBUS, and hopefully that means kvm will be able to resolve fault to a zero filled page and guest will not hang. But this means that data sharing between two processes is now broken. Writes by process A will not be visible to process B in another once this situation happens, IIUC. So if we were to MAP_NOSIGBUS, guest will not hang but failures resulting from ftruncate will be silent and will be noticed sometime later. I guess not exactly a very pleasant scenario... Thanks Vivek > > Signed-off-by: Ming Lin <mlin@xxxxxxxxxx> > --- > arch/parisc/include/uapi/asm/mman.h | 1 + > include/linux/mm.h | 2 ++ > include/linux/mman.h | 1 + > include/uapi/asm-generic/mman-common.h | 1 + > mm/memory.c | 11 +++++++++++ > mm/mmap.c | 4 ++++ > tools/include/uapi/asm-generic/mman-common.h | 1 + > 7 files changed, 21 insertions(+) > > diff --git a/arch/parisc/include/uapi/asm/mman.h b/arch/parisc/include/uapi/asm/mman.h > index ab78cba..eecf9af 100644 > --- a/arch/parisc/include/uapi/asm/mman.h > +++ b/arch/parisc/include/uapi/asm/mman.h > @@ -25,6 +25,7 @@ > #define MAP_STACK 0x40000 /* give out an address that is best suited for process/thread stacks */ > #define MAP_HUGETLB 0x80000 /* create a huge page mapping */ > #define MAP_FIXED_NOREPLACE 0x100000 /* MAP_FIXED which doesn't unmap underlying mapping */ > +#define MAP_NOSIGBUS 0x200000 /* do not SIGBUS on fault */ > #define MAP_UNINITIALIZED 0 /* uninitialized anonymous mmap */ > > #define MS_SYNC 1 /* synchronous memory sync */ > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 9e86ca1..100d122 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -373,6 +373,8 @@ int __add_to_page_cache_locked(struct page *page, struct address_space *mapping, > # define VM_UFFD_MINOR VM_NONE > #endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */ > > +#define VM_NOSIGBUS VM_FLAGS_BIT(38) /* Do not SIGBUS on fault */ > + > /* Bits set in the VMA until the stack is in its final location */ > #define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ) > > diff --git a/include/linux/mman.h b/include/linux/mman.h > index b2cbae9..c966b08 100644 > --- a/include/linux/mman.h > +++ b/include/linux/mman.h > @@ -154,6 +154,7 @@ static inline bool arch_validate_flags(unsigned long flags) > _calc_vm_trans(flags, MAP_DENYWRITE, VM_DENYWRITE ) | > _calc_vm_trans(flags, MAP_LOCKED, VM_LOCKED ) | > _calc_vm_trans(flags, MAP_SYNC, VM_SYNC ) | > + _calc_vm_trans(flags, MAP_NOSIGBUS, VM_NOSIGBUS ) | > arch_calc_vm_flag_bits(flags); > } > > diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h > index f94f65d..a2a5333 100644 > --- a/include/uapi/asm-generic/mman-common.h > +++ b/include/uapi/asm-generic/mman-common.h > @@ -29,6 +29,7 @@ > #define MAP_HUGETLB 0x040000 /* create a huge page mapping */ > #define MAP_SYNC 0x080000 /* perform synchronous page faults for the mapping */ > #define MAP_FIXED_NOREPLACE 0x100000 /* MAP_FIXED which doesn't unmap underlying mapping */ > +#define MAP_NOSIGBUS 0x200000 /* do not SIGBUS on fault */ > > #define MAP_UNINITIALIZED 0x4000000 /* For anonymous mmap, memory could be > * uninitialized */ > diff --git a/mm/memory.c b/mm/memory.c > index 8d5e583..6b5a897 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -3676,6 +3676,17 @@ static vm_fault_t __do_fault(struct vm_fault *vmf) > } > > ret = vma->vm_ops->fault(vmf); > + if (unlikely(ret & VM_FAULT_SIGBUS) && (vma->vm_flags & VM_NOSIGBUS)) { > + /* > + * For MAP_NOSIGBUS mapping, map in the zero page on read fault > + * or fill a freshly allocated page with zeroes on write fault > + */ > + ret = do_anonymous_page(vmf); > + if (!ret) > + ret = VM_FAULT_NOPAGE; > + return ret; > + } > + > if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY | > VM_FAULT_DONE_COW))) > return ret; > diff --git a/mm/mmap.c b/mm/mmap.c > index 8bed547..d5c9fb5 100644 > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -1419,6 +1419,10 @@ unsigned long do_mmap(struct file *file, unsigned long addr, > if (!len) > return -EINVAL; > > + /* Restrict MAP_NOSIGBUS to MAP_PRIVATE mapping */ > + if ((flags & MAP_NOSIGBUS) && !(flags & MAP_PRIVATE)) > + return -EINVAL; > + > /* > * Does the application expect PROT_READ to imply PROT_EXEC? > * > diff --git a/tools/include/uapi/asm-generic/mman-common.h b/tools/include/uapi/asm-generic/mman-common.h > index f94f65d..a2a5333 100644 > --- a/tools/include/uapi/asm-generic/mman-common.h > +++ b/tools/include/uapi/asm-generic/mman-common.h > @@ -29,6 +29,7 @@ > #define MAP_HUGETLB 0x040000 /* create a huge page mapping */ > #define MAP_SYNC 0x080000 /* perform synchronous page faults for the mapping */ > #define MAP_FIXED_NOREPLACE 0x100000 /* MAP_FIXED which doesn't unmap underlying mapping */ > +#define MAP_NOSIGBUS 0x200000 /* do not SIGBUS on fault */ > > #define MAP_UNINITIALIZED 0x4000000 /* For anonymous mmap, memory could be > * uninitialized */ > -- > 1.8.3.1 >