Thank you so much for your attention, Michal.
Are there any settings (such as sysctl parameters) that I can use to better control memory reclaim? For example, a limit on the number of mmapped pages, or on the amount of memory used by mmapped pages?
Or will the system start reclaiming only when it needs more memory?
I found that I could use madvise() with MADV_DONTNEED to actively free the RSS memory used by mmapped pages, but it would add more complexity to my software.
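
A minimal sketch of that madvise() approach, for illustration only. The helper names map_file_readonly and release_mapped_range are hypothetical, and the sketch assumes a read-only MAP_SHARED mapping of the whole file in a single mmap() call rather than one call per 4096 bytes. On such a mapping, MADV_DONTNEED drops the pages from the process (lowering RSS), while the data may still sit in the page cache until the kernel reclaims it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a whole file read-only with one mmap() call.
 * Returns MAP_FAILED on error; on success *len_out holds the mapped length. */
void *map_file_readonly(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return MAP_FAILED;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return MAP_FAILED;
    }

    void *addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (addr != MAP_FAILED)
        *len_out = (size_t)st.st_size;
    return addr;
}

/* Hypothetical helper: tell the kernel this range is not needed for now.
 * The mapping itself stays in place; touching it later faults the data
 * back in from the file. Returns 0 on success, -1 with errno set. */
int release_mapped_range(void *addr, size_t len)
{
    assert(addr != NULL && addr != MAP_FAILED && len > 0);
    return madvise(addr, len, MADV_DONTNEED);
}
```

The caller would invoke release_mapped_range() on mappings it does not expect to touch soon, and munmap() only when a mapping is no longer needed at all.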
On Wed, Jun 6, 2018 at 9:43 AM Michal Hocko <mhocko@xxxxxxxxxx> wrote:
On Tue 05-06-18 16:14:02, Rafael Telles wrote:
> Hi there, I am running a program where I need to map hundreds of thousands
> of files and each file has several kilobytes (min. of 4kb per file). The
> program calls mmap() for every 4096 bytes on each file, ending up with
> millions of memory mapped pages, so I have ceil(N/4096) pages for each
> file, where N is the file size.
>
> As the program runs, more files are created and the older files get bigger,
> then I need to remap those pages, so it's always adding more pages.
>
> I am concerned about when and how Linux is going to swap out pages in order
> to get more memory. The program seems to only increase its memory usage
> overall, and I am afraid it will run out of memory.
We definitely do reclaim mmapped memory - be it page cache or anonymous
memory. The code doing that is mostly in shrink_page_list (and
page_check_references for aging decisions) - somewhat non-trivial to
follow, but you know where to start looking at least ;)
> I tried setting these sysctl parameters so it would swap out as soon as
> possible (just to understand how Linux memory management works), but it
> didn't change anything:
>
> vm.zone_reclaim_mode = 1
This will make a difference only on NUMA machines, where it will try to
keep allocations on local nodes. It can lead to more extensive
reclaim, but I would definitely not recommend setting it unless you
want strong NUMA locality paid for by reclaiming more while the rest of
the memory might be sitting idle.
> vm.min_unmapped_ratio = 99
This one is active only for zone/node reclaim and tells whether to
reclaim a specific node based on how much of its memory is mapped. Your
setting would say that the node is not worth reclaiming unless 99%
of it is clean page cache (the behavior depends on zone_reclaim_mode
because zone_reclaim_mode = 1 excludes mapped pages AFAIR).
So this will most likely not do what you think.
> How can I be sure the program won't run out of memory?
The default overcommit setting should not allow you to mmap too much in
many cases.
> Do I have to manually unmap pages to free memory?
No.
--
Michal Hocko
SUSE Labs