On Thu, Apr 21, 2022 at 10:22:32AM -0700, Linus Torvalds wrote:
> I think 'copy_{to,from}_user()' actually does go to the effort to try
> to do byte-exact results, though.

Yeah, we have had headaches with this byte-exact copying, wrt MCEs.

> In particular, see copy_user_handle_tail in arch/x86/lib/copy_user_64.S.
>
> But I think that we long ago ended up deciding it really wasn't worth
> doing it, and x86 ends up just going to unnecessary lengths for this
> case.

You could give me some more details but AFAIU, you mean that the
fallback to byte-sized reads is unnecessary and I can get rid of
copy_user_handle_tail? Because that would be a nice cleanup.

Anyway, I ran your short prog and it all looks like you predicted it:

fsrm:
----
openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
mmap(NULL, 196608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fafd78fe000
munmap(0x7fafd790e000, 65536)           = 0
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 65536) = 17
exit_group(17)                          = ?
+++ exited with 17 +++

erms:
-----
openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
mmap(NULL, 196608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fe0b5321000
munmap(0x7fe0b5331000, 65536)           = 0
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 65536) = 17
exit_group(17)                          = ?
+++ exited with 17 +++

rep_good:
---------
openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
mmap(NULL, 196608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0b5f0c7000
munmap(0x7f0b5f0d7000, 65536)           = 0
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 65536) = 16
exit_group(16)                          = ?
+++ exited with 16 +++

original:
---------
openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
mmap(NULL, 196608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3ff61c6000
munmap(0x7f3ff61d6000, 65536)           = 0
read(3, strace: umoven: short read (17 < 33) @0x7f3ff61d5fef
0x7f3ff61d5fef, 65536)          = 3586
exit_group(3586)                        = ?
+++ exited with 2 +++

That "umoven: short read" is strace complaining that the address space
of the tracee is inaccessible. But the 17-byte short read is still
there.

From strace sources:

/* legacy method of copying from tracee */
static int
umoven_peekdata(const int pid, kernel_ulong_t addr, unsigned int len,
		void *laddr)
{

	...

		switch (errno) {
			case EFAULT: case EIO: case EPERM:
				/* address space is inaccessible */
				if (nread) {
					perror_msg("umoven: short read (%u < %u) @0x%" PRI_klx,
						   nread, nread + len, addr - nread);
				}
				return -1;

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette