On Thu, Apr 21, 2022 at 8:06 AM Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> on AMD zen3
>
> original:  20.11 Gb/s
> rep_good:  34.662 Gb/s
> erms:      36.378 Gb/s
> fsrm:      36.398 Gb/s

Looks good.

Of course, the interesting cases are the "took a page fault in the
middle" ones.

A very simple basic test is something like the attached.

It does no error checking or anything else, but doing a 'strace
./a.out' should give you something like

  ...
  openat(AT_FDCWD, "/dev/zero", O_RDONLY) = 3
  mmap(NULL, 196608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f10ddfd0000
  munmap(0x7f10ddfe0000, 65536) = 0
  read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 65536) = 16
  exit_group(16) = ?

where that "read(..) = 16" is the important part. It correctly figured
out that it can only do 16 bytes (ok, 17, but we've always allowed the
user accessor functions to block).

With erms/fsrm, presumably you get that optimal "read(..) = 17".

I'm sure we have a test-case for this somewhere, but it was easier for
me to write a few lines of (bad) code than try to find it.

             Linus
#include <stddef.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>

// Whatever
#define PAGE_SIZE 65536

int main(int argc, char **argv)
{
	int fd;
	void *map;

	fd = open("/dev/zero", O_RDONLY);

	map = mmap(NULL, 3*PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	munmap(map + PAGE_SIZE, PAGE_SIZE);
	return read(fd, map + PAGE_SIZE - 17, PAGE_SIZE);
}