On 17.07.21 04:57, Peter Collingbourne wrote:
> Introduce a new syscall, refpage_create, which returns a file
> descriptor which may be mapped using mmap. Such a mapping is similar
> to an anonymous mapping, but instead of clean pages being backed by the
> zero page, they are instead backed by a so-called reference page, whose
> contents are specified using an argument to refpage_create. Loads from
> the mapping will load directly from the reference page, and initial
> stores to the mapping will copy-on-write from the reference page.

I'm wondering, does the target use case really require the COW
optimization like we have for the shared zeropage?
If we avoided having a reference page at all and only stored the
pattern, we could significantly reduce memory consumption when a lot of
reference pages are in use, especially with multiple ones per process.
I'm asking because ...

> Reference pages are useful in circumstances where anonymous mappings
> combined with manual stores to memory would impose undesirable costs,
> either in terms of performance or RSS. Use cases are focused on heap
> allocators and include:
>
> - Pattern initialization for the heap. This is where malloc(3) gives
>   you memory whose contents are filled with a non-zero pattern
>   byte, in order to help detect and mitigate bugs involving use
>   of uninitialized memory. Typically this is implemented by having
>   the allocator memset the allocation with the pattern byte before
>   returning it to the user, but for large allocations this can result
>   in a significant increase in RSS, especially for allocations that
>   are used sparsely. Even for dense allocations there is a needless
>   impact to startup performance when it may be better to amortize it
>   throughout the program. By creating allocations using a reference
>   page filled with the pattern byte, we can avoid these costs.

... I assume the first *sane* access to such a page is a write, and not
a read.

> - Pre-tagged heap memory. Memory tagging [1] is an upcoming ARMv8.5
>   feature which allows for memory to be tagged in order to detect
>   certain kinds of memory errors with low overhead. In order to set
>   up an allocation to allow memory errors to be detected, the entire
>   allocation needs to have the same tag. The issue here is similar to
>   pattern initialization in the sense that large tagged allocations
>   will be expensive if the tagging is done up front. The idea is that
>   the allocator would create reference pages with each of the possible
>   memory tags, and use those reference pages for the large allocations.

... and here as well.
A first access that is a read sounds more like an actual BUG (cf.
"detect and mitigate bugs" above), which doesn't scream for a
performance optimization or for sacrificing a whole
(unmovable/unswappable) reference page.
So, what would you lose by not populating a real reference page at
all and instead only applying the pattern when populating a fresh
page (and populating a fresh page even on read faults)?
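
To make the question a bit more concrete, here is a rough userspace
sketch of what "only store the pattern" could look like when a fresh
page gets populated. fill_page_from_pattern() is a made-up helper for
illustration, not something from the patch; the real thing would sit in
the fault path, and tags would need arch-specific handling that this
ignores entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the proposed patch): populate one
 * fresh page by repeating a short stored pattern, instead of copying
 * from a dedicated reference page.
 */
static void fill_page_from_pattern(void *page, size_t page_size,
                                   const void *pattern, size_t pattern_len)
{
    uint8_t *dst = page;
    size_t filled;

    assert(pattern_len > 0 && pattern_len <= page_size);

    /* Seed the page with a single copy of the pattern. */
    memcpy(dst, pattern, pattern_len);

    /*
     * Grow the initialized prefix by copying it onto the tail. The
     * prefix length stays a multiple of pattern_len, so the pattern
     * phase is preserved, and source and destination never overlap.
     */
    for (filled = pattern_len; filled < page_size; ) {
        size_t n = filled;

        if (n > page_size - filled)
            n = page_size - filled;
        memcpy(dst + filled, dst, n);
        filled += n;
    }
}
```

With something like this, only pattern_len bytes have to be kept around
per "reference page", at the price of taking a fresh page on read
faults as well.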
--
Thanks,
David / dhildenb