Re: [PATCH 0/2] Fatal signal handling within uaccess faults

Mark Rutland <mark.rutland@xxxxxxx> · Tue, 22 Aug 2017 11:25:28 +0100

On Tue, Jul 11, 2017 at 03:16:28PM +0100, Mark Rutland wrote:
> Hi,
> 
> Arch maintainer tl;dr: most arch fault code doesn't handle fatal signals
> correctly, allowing unprivileged users to create an unkillable task which can
> lock up the system. Please check whether your arch is affected.

From a glance at v4.13-rc5, I believe this affects:

	alpha, cris, hexagon, ia64, m68k, metag, microblaze, mips,
	mn10300, nios2, openrisc, parisc, sparc (32-bit), tile,
	unicore32, xtensa

... I'm not sure about:

	arc, s390, sparc (64-bit)

... and I think that the following are ok:

	powerpc, sh, x86
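
For anyone auditing their arch's fault code, the failure mode described
in the quoted cover letter below can be modelled in plain userspace C.
This is only a sketch under stated assumptions: handle_fault(),
run_uaccess(), MODEL_STUCK and the replay cap are hypothetical names
made up for illustration, not kernel code, and the model only covers a
kernel-mode uaccess whose fault never resolves while a fatal signal is
pending.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Model-only return value: the replay cap was hit (hypothetical). */
#define MODEL_STUCK 1

/* What the faulting context does after the fault handler returns. */
enum fault_result {
	FAULT_REPLAY,	/* re-execute the faulting uaccess instruction */
	FAULT_FIXUP,	/* branch to the uaccess fixup instead */
};

/*
 * Hypothetical stand-in for an arch fault handler that bails out
 * because a fatal signal is pending. With apply_fixup == false it
 * returns to the faulting context untouched (the buggy pattern); with
 * apply_fixup == true it redirects to the fixup (the intended pattern).
 */
static enum fault_result handle_fault(bool apply_fixup)
{
	return apply_fixup ? FAULT_FIXUP : FAULT_REPLAY;
}

/*
 * Model of a kernel-mode uaccess to a page that never becomes present.
 * Returns -EFAULT once the fixup runs. A real task would spin forever
 * in the buggy case, so the model gives up after max_replays attempts
 * and returns MODEL_STUCK. *replays reports how many times the access
 * was replayed.
 */
static int run_uaccess(bool apply_fixup, int max_replays, int *replays)
{
	assert(max_replays > 0);
	*replays = 0;

	for (;;) {
		if (handle_fault(apply_fixup) == FAULT_FIXUP)
			return -EFAULT;
		if (++*replays == max_replays)
			return MODEL_STUCK;
	}
}
```

In this model, run_uaccess(true, ...) fails with -EFAULT on the first
fault, so the syscall can unwind and the signal can be handled, whereas
run_uaccess(false, ...) replays the access until the artificial cap is
reached.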

Thanks,
Mark.

> AFAICT, most arches don't correctly handle a fatal signal interrupting a
> uaccess fault. They attempt to bail out, returning to the faulting context
> without bothering to handle the fault, but forget to apply the uaccess fixup.
> Consequently, the uaccess gets replayed, and the same thing happens forever.
> 
> When this occurs, the relevant task never returns to userspace, never handles
> the fatal signal, and is stuck in an unkillable (though interruptible and
> preemptible) state. The task can inhibit forward progress of the rest of the
> system, leading to RCU stalls and lockups.
> 
> It's possible for an unprivileged user to trigger this deliberately using the
> userfaultfd syscall, as demonstrated by the test case at the end of this email
> (note: requires CONFIG_USERFAULTFD to be selected). I am not sure if this is
> the only way of triggering the issue.
> 
> I stumbled upon this while fuzzing arm64 with Syzkaller. I've verified that
> both arm and arm64 have the issue, and by inspection it seems that the majority
> of other architectures are affected.
> 
> It looks like this was fixed up for x86 in 2014 with commit:
> 
>   26178ec11ef3c6c8 ("x86: mm: consolidate VM_FAULT_RETRY handling")
> 
> ... but most other architectures never received a similar fixup.
> 
> The duplication (and divergence) of this logic is unfortunate. It's largely
> copy-paste code that could be consolidated under mm/.
> 
> Until we end up refactoring this, and so as to be suitable for backporting,
> this series fixes arm and arm64 in-place. I've not touched other architectures
> as I don't have the relevant hardware or arch knowledge.
> 
> Thanks,
> Mark.
> 
> ----
> #include <errno.h>
> #include <linux/userfaultfd.h>
> #include <stdio.h>
> #include <sys/ioctl.h>
> #include <sys/mman.h>
> #include <sys/syscall.h>
> #include <sys/vfs.h>
> #include <unistd.h>
> 
> int main(int argc, char *argv[])
> {
> 	void *mem;
> 	long pagesz;
> 	int uffd, ret;
> 	struct uffdio_api api = {
> 		.api = UFFD_API
> 	};
> 	struct uffdio_register reg;
> 
> 	pagesz = sysconf(_SC_PAGESIZE);
> 	if (pagesz < 0) {
> 		return errno;
> 	}
> 
> 	mem = mmap(NULL, pagesz, PROT_READ | PROT_WRITE,
> 		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> 	if (mem == MAP_FAILED)
> 		return errno;
> 
> 	uffd = syscall(__NR_userfaultfd, 0);
> 	if (uffd < 0)
> 		return errno;
> 
> 	ret = ioctl(uffd, UFFDIO_API, &api);
> 	if (ret < 0)
> 		return errno;
> 
> 	reg = (struct uffdio_register) {
> 		.range = {
> 			.start = (unsigned long)mem,
> 			.len = pagesz
> 		},
> 		.mode = UFFDIO_REGISTER_MODE_MISSING
> 	};
> 
> 	ret = ioctl(uffd, UFFDIO_REGISTER, &reg);
> 	if (ret < 0)
> 		return errno;
> 
> 	/*
> 	 * Force an arbitrary uaccess to memory monitored by the userfaultfd.
> 	 * This will block, but when a SIGKILL is sent, will consume all
> 	 * available CPU time without being killed, and may inhibit forward
> 	 * progress of the system.
> 	 */
> 	ret = fstatfs(0, (struct statfs *)mem);
> 
> 	return 0;
> }
> ----
> 
> Mark Rutland (2):
>   arm64: mm: abort uaccess retries upon fatal signal
>   arm: mm: abort uaccess retries upon fatal signal
> 
>  arch/arm/mm/fault.c   | 5 ++++-
>  arch/arm64/mm/fault.c | 5 ++++-
>  2 files changed, 8 insertions(+), 2 deletions(-)
> 
> -- 
> 1.9.1
>