Re: [RFC 0/7] introduce memory hinting API for external process

Brian Geffon <bgeffon@xxxxxxxxxx> · Tue, 21 May 2019 21:15:20 -0700

To expand on the ChromeOS use case, we're in a very similar situation to Android. For example, the Chrome browser uses a separate process for each individual tab (with some exceptions), and over time many tabs remain open in a backgrounded or idle state. Given that we have a lot of information about the weight of a tab, when it was last active, etc., we can benefit tremendously from per-process reclaim. We're working on getting real-world numbers, but all of our initial testing shows very promising results.
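
To make the "weight of a tab, when it was last active" idea concrete, here is a minimal sketch of how a userspace policy might pick reclaim candidates. It is only an illustration under assumptions: the `Tab` fields, `pick_reclaim_candidates`, and the threshold values are hypothetical names and numbers, not anything Chrome or ChromeOS actually uses.

```python
from dataclasses import dataclass


@dataclass
class Tab:
    pid: int              # process backing the tab (hypothetical field)
    resident_mb: float    # the tab's "weight" in resident memory
    idle_seconds: float   # time since the tab was last active
    foreground: bool = False


def pick_reclaim_candidates(tabs, min_idle_seconds=300.0, target_mb=512.0):
    """Return background tabs to hint for reclaim, longest-idle first,
    stopping once their combined weight reaches target_mb."""
    eligible = [t for t in tabs
                if not t.foreground and t.idle_seconds >= min_idle_seconds]
    # Longest idle first; among equally idle tabs, heavier ones first.
    eligible.sort(key=lambda t: (-t.idle_seconds, -t.resident_mb))
    chosen, total = [], 0.0
    for tab in eligible:
        if total >= target_mb:
            break
        chosen.append(tab)
        total += tab.resident_mb
    return chosen


tabs = [
    Tab(pid=1, resident_mb=400, idle_seconds=10, foreground=True),
    Tab(pid=2, resident_mb=300, idle_seconds=3600),
    Tab(pid=3, resident_mb=250, idle_seconds=600),
    Tab(pid=4, resident_mb=100, idle_seconds=60),
    Tab(pid=5, resident_mb=200, idle_seconds=7200),
]
candidates = pick_reclaim_candidates(tabs)
```

The processes selected this way would then be the ones handed to the per-process reclaim hint; the foreground tab and recently used tabs are left alone.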

On Tue, May 21, 2019 at 5:57 AM Shakeel Butt <shakeelb@xxxxxxxxxx> wrote:
On Mon, May 20, 2019 at 7:55 PM Anshuman Khandual <anshuman.khandual@xxxxxxx> wrote:

> On 05/20/2019 10:29 PM, Tim Murray wrote:
> > On Sun, May 19, 2019 at 11:37 PM Anshuman Khandual
> > <anshuman.khandual@xxxxxxx> wrote:
> >>
> >> Or is the objective here to reduce the number of processes which get killed by
> >> lmkd by triggering swapping for the unused memory (user hinted) sooner, so that
> >> they don't get picked by lmkd? Is under-utilization of zram hardware a concern
> >> here as well?

> >

> > The objective is to avoid some instances of memory pressure by
> > proactively swapping pages that userspace knows to be cold before
> > those pages reach the end of the LRUs, which in turn can prevent some
> > apps from being killed by lmk/lmkd. As soon as Android userspace knows
> > that an application is not being used and is only resident to improve
> > performance if the user returns to that app, we can kick off
> > process_madvise on that process's pages (or some portion of those
> > pages) in a power-efficient way to reduce memory pressure long before
> > the system hits the free page watermark. This allows the system more
> > time to put pages into zram versus waiting for the watermark to
> > trigger kswapd, which decreases the likelihood that later memory
> > allocations will cause enough pressure to trigger a kill of one of
> > these apps.

>

> So this opens up a bit of LRU management to user space hints. Also, because the app
> itself won't know about the memory situation of the entire system, the new system
> call needs to be called from an external process.

>

> >

> >> Won't swapping out memory into zram increase the latency of a hot start? Or
> >> is it because it will prevent a fresh cold start, which would be slower
> >> than a slow hot start anyway? Just being curious.

> >

> > First, not all swapped pages will be reloaded immediately once an app
> > is resumed. We've found that an app's working set post-process_madvise
> > is significantly smaller than what an app allocates when it first
> > launches (see the delta between pswpin and pswpout in Minchan's
> > results). Presumably because of this, faulting to fetch from zram does

>

> pswpin      417613    1392647     975034     233.00
> pswpout    1274224    2661731    1387507     108.00
>
> IIUC the swap-in ratio is way higher in comparison to that of swap-out. Is that
> always the case? Or does it tend to swap out from an active area of the working
> set, which is then faulted back in again?
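
As a quick sanity check on the two rows quoted above, the numbers can be recomputed in a few lines. This is only a sketch and rests on an assumption: the quoted table carries no header here, so the four columns are taken to be a first measurement, a second measurement, their difference, and the percentage change. The helper name `delta_and_percent` is made up for the example.

```python
# Rows copied from the quoted results; the column meaning is an assumption
# (first measurement, second measurement).
rows = {
    "pswpin": (417613, 1392647),
    "pswpout": (1274224, 2661731),
}


def delta_and_percent(first, second):
    """Return the absolute change and the change as a percentage of `first`."""
    delta = second - first
    return delta, 100.0 * delta / first


for name, (first, second) in rows.items():
    delta, pct = delta_and_percent(first, second)
    print(f"{name:8s} delta={delta:8d} change={pct:6.2f}%")

# Pages swapped in per page swapped out, for each measurement.
in_per_out_first = rows["pswpin"][0] / rows["pswpout"][0]
in_per_out_second = rows["pswpin"][1] / rows["pswpout"][1]
```

Under that reading, the deltas match the third column exactly and the integer parts of the percentages match the fourth. Swap-in grows faster than swap-out in relative terms, but in both measurements fewer pages are swapped in than swapped out.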

>

> > not seem to introduce a noticeable hot start penalty, nor does it
> > cause an increase in performance problems later in the app's
> > lifecycle. I've measured with and without process_madvise, and the
> > differences are within our noise bounds. Second, because we're not

>

> That is assuming that the post-process_madvise() working set of the application is
> always smaller. There is another challenge: the external process should ideally
> know the active areas of the working set of the application in question, so that
> it can invoke process_madvise() correctly and prevent such scenarios.

>

> > preemptively evicting file pages and only making them more likely to
> > be evicted when there's already memory pressure, we avoid the case
> > where we process_madvise an app then immediately return to the app and
> > reload all file pages in the working set even though there was no
> > intervening memory pressure. Our initial version of this work evicted

>

> That would be the worst-case scenario, which should be avoided. Memory pressure
> must be a parameter before actually doing the swap-out. But pages, if known to be
> inactive/cold, can be marked as high priority for swap-out.

>

> > file pages preemptively and did cause a noticeable slowdown (~15%) for
> > that case; this patch set avoids that slowdown. Finally, the benefit
> > from avoiding cold starts is huge. The performance improvement from
> > having a hot start instead of a cold start ranges from 3x for very
> > small apps to 50x+ for larger apps like high-fidelity games.
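
The difference between "making pages more likely to be evicted" and "evicting them preemptively" can be illustrated with a toy two-list model. This is a deliberately simplified sketch, not the kernel's LRU implementation: `ToyLRU`, `hint_cold`, `hint_evict`, `reclaim`, and `touch` are hypothetical names invented for the example.

```python
from collections import deque


class ToyLRU:
    """Toy model with an active and an inactive list of page ids."""

    def __init__(self, pages):
        self.active = deque(pages)   # left end = most recently used
        self.inactive = deque()      # right end is reclaimed first
        self.evicted = set()

    def hint_cold(self, pages):
        """Deactivate pages without evicting them."""
        for p in pages:
            if p in self.active:
                self.active.remove(p)
                self.inactive.append(p)

    def hint_evict(self, pages):
        """Evict pages immediately, regardless of memory pressure."""
        for p in pages:
            for lst in (self.active, self.inactive):
                if p in lst:
                    lst.remove(p)
                    self.evicted.add(p)

    def reclaim(self, count):
        """Memory pressure: evict from the inactive list first, then active."""
        for _ in range(count):
            if self.inactive:
                self.evicted.add(self.inactive.pop())
            elif self.active:
                self.evicted.add(self.active.pop())

    def touch(self, page):
        """Access a page; return True if it had to be brought back in."""
        faulted = page in self.evicted
        self.evicted.discard(page)
        for lst in (self.active, self.inactive):
            if page in lst:
                lst.remove(page)
        self.active.appendleft(page)
        return faulted


app = ["a1", "a2", "a3"]

# Hint, then return to the app with no memory pressure in between.
cold = ToyLRU(app + ["other"])
cold.hint_cold(app)
faults_cold = sum(cold.touch(p) for p in app)

eager = ToyLRU(app + ["other"])
eager.hint_evict(app)
faults_eager = sum(eager.touch(p) for p in app)

# With pressure after the hint, the hinted pages go before anything else.
pressured = ToyLRU(app + ["other"])
pressured.hint_cold(app)
pressured.reclaim(2)
```

In this model the deactivate-only hint costs nothing when the user comes straight back (`faults_cold` is 0), whereas eager eviction forces every page to be reloaded (`faults_eager` is 3). When pressure does arrive, the hinted pages are the first to go and the unrelated page stays resident.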

>

> Is there any other real-world scenario, apart from this app-based ecosystem, where
> user-hinted LRU management might be helpful? Just being curious. Thanks for the
> detailed explanation. I will continue looking into this series.

Chrome OS is another real-world use case for this user-hinted LRU
management approach: it proactively reclaims memory from tabs not
accessed by the user for some time.