On Tue, Dec 29, 2020 at 4:41 PM Peter Collingbourne <pcc@xxxxxxxxxx> wrote: > > In the Scudo memory allocator [1] we would like to be able to > detect use-after-free vulnerabilities involving large allocations > by issuing mprotect(PROT_NONE) on the memory region used for the > allocation when it is deallocated. Later on, after the memory > region has been "quarantined" for a sufficient period of time we > would like to be able to use it for another allocation by issuing > mprotect(PROT_READ|PROT_WRITE). > > Before this patch, after removing the write protection, any writes > to the memory region would result in page faults and entering > the copy-on-write code path, even in the usual case where the > pages are only referenced by a single PTE, harming performance > unnecessarily. Make it so that any pages in anonymous mappings that > are only referenced by a single PTE are immediately made writable > during the mprotect so that we can avoid the page faults. > > This program shows the critical syscall sequence that we intend to > use in the allocator: > > #include <string.h> > #include <sys/mman.h> > > enum { kSize = 131072 }; > > int main(int argc, char **argv) { > char *addr = (char *)mmap(0, kSize, PROT_READ | PROT_WRITE, > MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); > for (int i = 0; i != 100000; ++i) { > memset(addr, i, kSize); > mprotect((void *)addr, kSize, PROT_NONE); > mprotect((void *)addr, kSize, PROT_READ | PROT_WRITE); > } > } > > The effect of this patch on the above program was measured on a > DragonBoard 845c by taking the median real time execution time of > 10 runs. > > Before: 3.19s > After: 0.79s > > The effect was also measured using one of the microbenchmarks that > we normally use to benchmark the allocator [2], after modifying it > to make the appropriate mprotect calls [3]. With an allocation size > of 131072 bytes to trigger the allocator's "large allocation" code > path the per-iteration time was measured as follows: > > Before: 33364ns > After: 6886ns > > This patch means that we do more work during the mprotect call itself > in exchange for less work when the pages are accessed. In the worst > case, the pages are not accessed at all. The effect of this patch in > such cases was measured using the following program: > > #include <string.h> > #include <sys/mman.h> > > enum { kSize = 131072 }; > > int main(int argc, char **argv) { > char *addr = (char *)mmap(0, kSize, PROT_READ | PROT_WRITE, > MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); > memset(addr, 1, kSize); > for (int i = 0; i != 100000; ++i) { > #ifdef PAGE_FAULT > memset(addr + (i * 4096) % kSize, i, 4096); > #endif > mprotect((void *)addr, kSize, PROT_NONE); > mprotect((void *)addr, kSize, PROT_READ | PROT_WRITE); > } > } > > With PAGE_FAULT undefined (0 pages touched after removing write > protection) the median real time execution time of 100 runs was > measured as follows: > > Before: 0.325928s > After: 0.365493s > > With PAGE_FAULT defined (1 page touched) the measurements were > as follows: > > Before: 0.441516s > After: 0.380251s > > So it seems that even with a single page fault the new approach > is faster. > > I saw similar results if I adjusted the programs to use a larger > mapping size. With kSize = 1048576 I get these numbers with PAGE_FAULT > undefined: > > Before: 1.563078s > After: 1.607476s > > i.e. around 3%. > > And these with PAGE_FAULT defined: > > Before: 1.684663s > After: 1.683272s > > i.e. about the same. > > What I think we may conclude from these results is that for smaller > mappings the advantage of the previous approach, although measurable, > is wiped out by a single page fault. I think we may expect that there > should be at least one access resulting in a page fault (under the > previous approach) after making the pages writable, since the program > presumably made the pages writable for a reason. > > For larger mappings we may guesstimate that the new approach wins if > the density of future page faults is > 0.4%. But for the mappings that > are large enough for density to matter (not just the absolute number > of page faults) it doesn't seem like the increase in mprotect latency > would be very large relative to the total mprotect execution time. > > Signed-off-by: Peter Collingbourne <pcc@xxxxxxxxxx> > Link: https://linux-review.googlesource.com/id/I98d75ef90e20330c578871c87494d64b1df3f1b8 > Link: [1] https://source.android.com/devices/tech/debug/scudo > Link: [2] https://cs.android.com/android/platform/superproject/+/master:bionic/benchmarks/stdlib_benchmark.cpp;l=53;drc=e8693e78711e8f45ccd2b610e4dbe0b94d551cc9 > Link: [3] https://github.com/pcc/llvm-project/commit/scudo-mprotect-secondary Hi Andrew, We had some test failures on Android that were tracked down to this patch, so it should probably be backed out of -mm until the problem is resolved. Peter