Re: [PATCH v2 1/3] x86/mce: Avoid infinite loop for copy from user recovery

"Luck, Tony" <tony.luck@xxxxxxxxx> · Mon, 23 Aug 2021 08:24:37 -0700

On Sun, Aug 22, 2021 at 04:46:14PM +0200, Borislav Petkov wrote:
> On Fri, Aug 20, 2021 at 01:33:56PM -0700, Luck, Tony wrote:
> > The new version (thanks to Al fixing iov_iter.c) now does
> > exactly what POSIX says should happen.  If I have a buffer
> > with poison at offset 213, and I do this:
> > 
> > 	ret = write(fd, buf, 512);
> > 
> > Then the return from write is 213, and the first 213 bytes
> > from the buffer appear in the file, and the file size is
> > incremented by 213 (assuming the write started with the lseek
> > offset at the original size of the file).
> 
> ... and the user still gets a SIGBUS so that it gets a chance to handle
> the encountered poison? I.e., not retry the write for the remaining 512
> - 213 bytes?

Whether the user gets a SIGBUS depends on what they do next.  In a typical
user loop trying to do a write:

	while (nbytes) {
		ret = write(fd, buf, nbytes);
		if (ret == -1)
			return ret;
		buf += ret;
		nbytes -= ret;
	}

The next iteration after the short write caused by the machine check
will fail: write() returns -1 with errno set to EFAULT.

Andy Lutomirski convinced me that the kernel should not send a SIGBUS
to an application when the kernel accesses the poison in user memory.

If the user then accesses the poisoned page directly, they'll get a
SIGBUS: the page was unmapped, so the access takes a #PF, and the x86
fault handler sees that the page was unmapped because of poison and
sends a SIGBUS.

> If so, do we document that somewhere so that application writers can
> know what they should do in such cases?

Applications see a failed write ... they should do whatever they would
normally do for a failed write.
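
As a rough illustration of "whatever they would normally do", here is a
sketch of the write loop above wrapped in a helper that also reports how
much data reached the file before the failure. The name write_all() and
its "done" parameter are hypothetical, and retrying on EINTR is an
assumption about what the application wants:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write nbytes from buf to fd, continuing after
 * short writes.  Returns 0 on success, or -1 with errno set by write().
 * *done is set to the number of bytes written before returning.
 */
static int write_all(int fd, const char *buf, size_t nbytes, size_t *done)
{
	*done = 0;
	while (nbytes) {
		ssize_t ret = write(fd, buf, nbytes);

		if (ret == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, nothing written */
			return -1;		/* e.g. EFAULT, EIO, ENOSPC */
		}
		buf += ret;
		nbytes -= ret;
		*done += ret;
	}
	return 0;
}
```

With the 512-byte buffer poisoned at offset 213, such a helper would be
expected to return -1 with errno == EFAULT and *done == 213, so the
caller handles it like any other write that failed part way through.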

-Tony