On 28.02.24 23:56, Khalid Aziz wrote:
Threads of a process share the address space and page tables, which allows for
two key advantages:
1. The amount of memory required for PTEs to map physical pages stays low
even when a large number of threads share the same pages, since the PTEs are
shared across threads.
2. Page protection attributes are shared across threads, and a change
of attributes applies immediately to every thread without any overhead
of coordinating protection bit changes across threads.
These advantages no longer apply when unrelated processes share pages.
Large database applications can easily comprise thousands of processes
that share hundreds of GB of pages. In cases like this, the amount of
memory consumed by page tables can exceed the size of the actual shared
data. On a database server with a 300GB SGA, a system crash due to an
out-of-memory condition was seen when 1500+ clients tried to share this
SGA, even though the system had 512GB of memory. On this server, the
worst-case scenario of all 1500 processes mapping every page of the SGA
would have required 878GB+ for just the PTEs.
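For reference, the arithmetic behind that figure, assuming 4KB base pages
and 8-byte PTEs as on x86-64:

  300GB / 4KB pages   = ~78.6 million PTEs per process mapping the full SGA
  78.6M * 8 bytes     = ~600MB of PTE memory per process
  600MB * 1500 procs  = ~879GB of PTE memory in total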
I have sent proposals and patches to solve this problem by adding a
mechanism to the kernel that lets processes opt into sharing page
tables with other processes. We have had discussions on the original
proposal and subsequent refinements, but we have not converged on a
solution. As systems with multi-TB memory and in-memory databases
become more and more common, this is becoming a significant issue.
An interactive discussion can help us reach a consensus on how to
solve this.
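To make the idea concrete, here is a minimal userspace sketch of one
possible opt-in flow, assuming an msharefs-style interface in which a
file in a special mount represents a region whose page tables are shared
by every process mapping it. The mount point, file name, and region size
below are purely illustrative; the actual interface has varied across
the posted revisions.

/* Hypothetical illustration only: assumes an msharefs-style mount at
 * /sys/fs/mshare where a file represents a region whose page tables
 * are shared by every process that maps it. Not the actual ABI.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define SHARED_SIZE (1UL << 30)	/* 1GB region for the example */

int main(void)
{
	int fd;
	void *addr;

	/* The creator backs the region with a file in the
	 * (hypothetical) msharefs mount; other processes opt in by
	 * opening and mapping the same file.
	 */
	fd = open("/sys/fs/mshare/sga_region", O_RDWR | O_CREAT, 0600);
	if (fd < 0) {
		perror("open");
		return EXIT_FAILURE;
	}
	if (ftruncate(fd, SHARED_SIZE) < 0) {
		perror("ftruncate");
		close(fd);
		return EXIT_FAILURE;
	}

	/* Every process mapping this file would share one set of page
	 * tables for the region, so PTE memory is paid once rather
	 * than once per process.
	 */
	addr = mmap(NULL, SHARED_SIZE, PROT_READ | PROT_WRITE,
		    MAP_SHARED, fd, 0);
	if (addr == MAP_FAILED) {
		perror("mmap");
		close(fd);
		return EXIT_FAILURE;
	}

	/* ... populate and use the shared region ... */

	munmap(addr, SHARED_SIZE);
	close(fd);
	return 0;
}

One design question such a file-based opt-in raises, and that the earlier
threads touched on, is who owns the shared page tables and what their
lifetime is once they are decoupled from any single process.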
Hi,
I was hoping for a follow-up to my previous comments from ~4 months ago
[1], so one problem of "not converging" might be "no follow-up discussion".
Ideally, this session would not focus on mshare as previously discussed
at LSF/MM, but take a step back and discuss requirements and possible
adjustments to the original concept to get something possibly cleaner.
For example, I raised some ideas for not having to re-route
mprotect()/mmap() calls. At least discussing somewhere why they are all
bad would be helpful ;)
[1]
https://lore.kernel.org/lkml/927b6339-ac5f-480c-9cdc-49c838cbef20@xxxxxxxxxx/
--
Cheers,
David / dhildenb