Re: [PATCH v5] mm: introduce reference pages

On 17.07.21 04:57, Peter Collingbourne wrote:
> Introduce a new syscall, refpage_create, which returns a file
> descriptor which may be mapped using mmap. Such a mapping is similar
> to an anonymous mapping, but instead of clean pages being backed by the
> zero page, they are instead backed by a so-called reference page, whose
> contents are specified using an argument to refpage_create. Loads from
> the mapping will load directly from the reference page, and initial
> stores to the mapping will copy-on-write from the reference page.

I'm wondering, does the target use case really require the COW optimization like we have for the shared zeropage?

If we avoided having a reference page at all and stored only the pattern, we could significantly reduce memory consumption when many reference pages are in use, especially several per process. I'm asking because ...


> Reference pages are useful in circumstances where anonymous mappings
> combined with manual stores to memory would impose undesirable costs,
> either in terms of performance or RSS. Use cases are focused on heap
> allocators and include:
>
> - Pattern initialization for the heap. This is where malloc(3) gives
>    you memory whose contents are filled with a non-zero pattern
>    byte, in order to help detect and mitigate bugs involving use
>    of uninitialized memory. Typically this is implemented by having
>    the allocator memset the allocation with the pattern byte before
>    returning it to the user, but for large allocations this can result
>    in a significant increase in RSS, especially for allocations that
>    are used sparsely. Even for dense allocations there is a needless
>    impact to startup performance when it may be better to amortize it
>    throughout the program. By creating allocations using a reference
>    page filled with the pattern byte, we can avoid these costs.

... I assume the first *sane* access to such a page is a write, and not a read.
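
For reference, the conventional approach described in the quoted text (map anonymously, then memset the pattern byte before handing out the memory) can be sketched roughly as below. This is only a minimal illustration under my own assumptions: alloc_pattern_eager() and resident_pages() are hypothetical helper names, not anything from the patch, and mincore(2) is used merely as a rough way to observe how many pages ended up resident.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): map `len` bytes anonymously
 * and eagerly fill them with `pattern`. Returns NULL on failure.
 * The memset writes to every page, so every page gets populated.
 */
void *alloc_pattern_eager(size_t len, unsigned char pattern)
{
    void *p;

    assert(len > 0);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    memset(p, pattern, len);
    return p;
}

/*
 * Hypothetical helper: count the pages of [addr, addr + len) that
 * mincore(2) reports as resident. `addr` must be page-aligned.
 * Returns -1 on failure.
 */
long resident_pages(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t n = (len + page - 1) / page;
    unsigned char *vec = malloc(n);
    long count = 0;

    if (!vec)
        return -1;
    if (mincore(addr, len, vec) != 0) {
        free(vec);
        return -1;
    }
    for (size_t i = 0; i < n; i++)
        count += vec[i] & 1;   /* low bit set means the page is resident */
    free(vec);
    return count;
}
```

With this scheme, a large allocation is fully touched before the caller ever uses it, so one would expect resident_pages() to report close to the whole mapping even if the program later uses only a few pages of it.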


> - Pre-tagged heap memory. Memory tagging [1] is an upcoming ARMv8.5
>    feature which allows for memory to be tagged in order to detect
>    certain kinds of memory errors with low overhead. In order to set
>    up an allocation to allow memory errors to be detected, the entire
>    allocation needs to have the same tag. The issue here is similar to
>    pattern initialization in the sense that large tagged allocations
>    will be expensive if the tagging is done up front. The idea is that
>    the allocator would create reference pages with each of the possible
>    memory tags, and use those reference pages for the large allocations.

... and here as well.

A first access that is a read sounds more like an actual BUG (the use case itself is to "detect and mitigate bugs"), which doesn't call for a performance optimization, nor for sacrificing a whole (unmovable/unswappable) reference page.

So, what would you lose by not populating a real reference page at all, and instead only writing the pattern when populating a fresh page (and populating a fresh page even on read faults)?
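
To make the alternative concrete, here is a toy userspace model of what I mean. It is not kernel code and not taken from the patch: the struct, the lazy_*() names, the fixed MODEL_PAGE_SIZE and the single-byte pattern are all assumptions for illustration. The region stores only the pattern; a page is allocated and filled on the first access to it, whether that access is a read or a write.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size for this model */

/* Hypothetical model of a region that keeps a pattern, not a reference page. */
struct lazy_region {
    unsigned char pattern;      /* the only per-region "reference" state */
    size_t npages;
    size_t populated;           /* number of pages populated so far */
    unsigned char **pages;      /* NULL entry: page not yet populated */
};

int lazy_init(struct lazy_region *r, size_t npages, unsigned char pattern)
{
    r->pages = calloc(npages, sizeof(*r->pages));
    if (!r->pages)
        return -1;
    r->pattern = pattern;
    r->npages = npages;
    r->populated = 0;
    return 0;
}

/*
 * Models the fault path: on the first access to a page (read or write),
 * allocate a fresh page and fill it with the pattern.
 */
unsigned char *lazy_fault(struct lazy_region *r, size_t off)
{
    size_t idx = off / MODEL_PAGE_SIZE;

    assert(idx < r->npages);
    if (!r->pages[idx]) {
        unsigned char *pg = malloc(MODEL_PAGE_SIZE);

        if (!pg)
            return NULL;
        memset(pg, r->pattern, MODEL_PAGE_SIZE);
        r->pages[idx] = pg;
        r->populated++;
    }
    return r->pages[idx] + off % MODEL_PAGE_SIZE;
}

/* Returns the byte at `off`, or -1 if the page could not be populated. */
int lazy_read(struct lazy_region *r, size_t off)
{
    unsigned char *p = lazy_fault(r, off);

    return p ? *p : -1;
}

/* Returns 0 on success, -1 if the page could not be populated. */
int lazy_write(struct lazy_region *r, size_t off, unsigned char val)
{
    unsigned char *p = lazy_fault(r, off);

    if (!p)
        return -1;
    *p = val;
    return 0;
}

void lazy_destroy(struct lazy_region *r)
{
    for (size_t i = 0; i < r->npages; i++)
        free(r->pages[i]);
    free(r->pages);
}
```

The trade-off in this model is that a read of an untouched page costs a full page, where the reference page approach would map the shared page instead; in exchange, no page-sized reference object has to be kept around per pattern.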


--
Thanks,

David / dhildenb



