On Fri, Aug 20, 2021 at 09:27:44PM +0200, Borislav Petkov wrote: > On Fri, Aug 20, 2021 at 11:59:45AM -0700, Luck, Tony wrote: > As in: there was an MCE while trying to access this user memory, you > should not do get_user anymore. You did add that > > * Return zero to pretend that this copy succeeded. This > * is counter-intuitive, but needed to prevent the code > * in lib/iov_iter.c from retrying and running back into > > which you're removing with the last patch so I'm confused. Forget to address this part in the earlier reply. My original code that forced a zero return has a hack. It allowed recovery to complete, but only because there was going to be a SIGBUS. There were some unplesant side effects. E.g. on a write syscall the file size was updated as if the write had succeeded. That would be very confusing for anyone trying to clean up afterwards as the file would have good data that was copied from the user up to the point where the machine check interrupted things. Then NUL bytes after (because the kernel clears pages that are allocated into the page cache). The new version (thanks to All fixing iov_iter.c) now does exactly what POSIX says should happen. If I have a buffer with poison at offset 213, and I do this: ret = write(fd, buf, 512); Then the return from write is 213, and the first 213 bytes from the buffer appear in the file, and the file size is incremented by 213 (assuming the write started with the lseek offset at the original size of the file). -Tony