Re: [PATCH v5 0/7] psi: pressure stall monitors v5

On Tue, Mar 19, 2019 at 3:51 PM Minchan Kim <minchan@xxxxxxxxxx> wrote:
>
> On Fri, Mar 08, 2019 at 10:43:04AM -0800, Suren Baghdasaryan wrote:
> > This is a respin of:
> >   https://lwn.net/ml/linux-kernel/20190206023446.177362-1-surenb%40google.com/
> >
> > Android is adopting psi to detect and remedy memory pressure that
> > results in stuttering and decreased responsiveness on mobile devices.
> >
> > Psi gives us the stall information, but because we're dealing with
> > latencies in the millisecond range, periodically reading the pressure
> > files to detect stalls in a timely fashion is not feasible. Psi also
> > doesn't aggregate its averages at a high-enough frequency right now.
> >
> > This patch series extends the psi interface such that users can
> > configure sensitive latency thresholds and use poll() and friends to
> > be notified when these are breached.
> >
> > As high-frequency aggregation is costly, it implements an aggregation
> > method that is optimized for fast, short-interval averaging, and makes
> > the aggregation frequency adaptive, such that high-frequency updates
> > only happen while monitored stall events are actively occurring.
> >
> > With these patches applied, Android can monitor for, and ward off,
> > mounting memory shortages before they cause problems for the user.
> > For example, using memory stall monitors in the userspace low memory
> > killer daemon (lmkd), we can detect mounting pressure and kill less
> > important processes before the device becomes visibly sluggish. In
> > our memory stress testing, psi memory monitors produce roughly 10x
> > fewer false positives than vmpressure signals. The ability to specify
> > multiple triggers for the same psi metric allows other parts of the
> > Android framework to monitor the memory state of the device and act
> > accordingly.
> >
> > The new interface is straightforward. The user opens one of the
> > pressure files for writing and writes a trigger description to the
> > file descriptor. The description defines the stall state (some or
> > full) and the maximum stall time over a given time window, both in
> > microseconds. E.g.:
> >
> >         /* Signal when stall time exceeds 100ms of a 1s window */
> >         char trigger[] = "full 100000 1000000";
> >         struct pollfd pfd;
> >
> >         pfd.fd = open("/proc/pressure/memory", O_RDWR | O_NONBLOCK);
> >         pfd.events = POLLPRI;
> >         write(pfd.fd, trigger, sizeof(trigger));
> >         while (poll(&pfd, 1, -1) >= 0) {
> >                 ...
> >         }
> >         close(pfd.fd);
> >
> > When the monitored stall state is entered, psi adapts its aggregation
> > frequency according to what the configured time window requires in
> > order to emit event signals in a timely fashion. Once the stalling
> > subsides, aggregation reverts to normal.
> >
> > The trigger is associated with the open file descriptor. To stop
> > monitoring, the user only needs to close the file descriptor and the
> > trigger is discarded.
> >
> > Patches 1-6 prepare the psi code for polling support. Patch 7 implements
> > the adaptive polling logic, the pressure growth detection optimized for
> > short intervals, and hooks up write() and poll() on the pressure files.
> >
> > The patches were developed in collaboration with Johannes Weiner.
> >
> > The patches are based on 5.0-rc8 (Merge tag 'drm-next-2019-03-06').
> >
> > Suren Baghdasaryan (7):
> >   psi: introduce state_mask to represent stalled psi states
> >   psi: make psi_enable static
> >   psi: rename psi fields in preparation for psi trigger addition
> >   psi: split update_stats into parts
> >   psi: track changed states
> >   refactor header includes to allow kthread.h inclusion in psi_types.h
> >   psi: introduce psi monitor
> >
> >  Documentation/accounting/psi.txt | 107 ++++++
> >  include/linux/kthread.h          |   3 +-
> >  include/linux/psi.h              |   8 +
> >  include/linux/psi_types.h        | 105 +++++-
> >  include/linux/sched.h            |   1 -
> >  kernel/cgroup/cgroup.c           |  71 +++-
> >  kernel/kthread.c                 |   1 +
> >  kernel/sched/psi.c               | 613 ++++++++++++++++++++++++++++---
> >  8 files changed, 833 insertions(+), 76 deletions(-)
> >
> > Changes in v5:
> > - Fixed sparse: error: incompatible types in comparison expression,
> >   as per Andrew
> > - Changed psi_enable to static, as per Andrew
> > - Refactored headers to be able to include kthread.h into psi_types.h
> >   without creating a circular inclusion, as per Johannes
> > - Split psi monitor from aggregator and used an RT worker for psi
> >   monitoring to prevent it from being starved by other RT threads,
> >   which would delay or lose memory pressure events, as per Minchan
> >   and Android Performance Team
> > - Fixed blockable memory allocation under rcu_read_lock inside
> >   psi_trigger_poll by using refcounting, as per Eva Huang and Minchan
> > - Misc cleanup and improvements, as per Johannes
> >
> > Notes:
> > 0001-psi-introduce-state_mask-to-represent-stalled-psi-st.patch is unchanged
> > from the previous version and provided for completeness.
>
> Please fix the kbuild test bot's warning in 6/7.
> Other than that, for all patches,

Thanks for the review!
Pushed v6 with the fix for the warning: https://lkml.org/lkml/2019/3/19/987
Also fixed a bug introduced in https://lkml.org/lkml/2019/3/8/686
which I discovered while testing (description in the changelog of the
new patchset).

>
> Acked-by: Minchan Kim <minchan@xxxxxxxxxx>
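
For anyone who wants to try the trigger interface described in the quoted
cover letter, the following is a minimal sketch in C, not code from the
patch series. The helper names (psi_format_trigger, psi_register_trigger,
psi_wait_event) are hypothetical. The O_RDWR | O_NONBLOCK open flags and the
POLLPRI event mask follow the example in the kernel's psi documentation and
are assumed to apply to this version of the series as well. The check that
the stall threshold is non-zero and no larger than the window is a sanity
check added for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Build a trigger description "<some|full> <stall_us> <window_us>".
 * Returns 0 on success, -1 on invalid input or if buf is too small.
 * (Hypothetical helper; the validation rules are assumptions.)
 */
int psi_format_trigger(char *buf, size_t len, const char *state,
                       unsigned long stall_us, unsigned long window_us)
{
        int n;

        assert(buf != NULL && state != NULL);
        if (strcmp(state, "some") != 0 && strcmp(state, "full") != 0)
                return -1;
        if (stall_us == 0 || stall_us > window_us)
                return -1;
        n = snprintf(buf, len, "%s %lu %lu", state, stall_us, window_us);
        return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Open a pressure file and write the trigger into it.
 * Returns the file descriptor, or -1 on failure. Closing the
 * descriptor discards the trigger.
 */
int psi_register_trigger(const char *path, const char *trigger)
{
        int fd = open(path, O_RDWR | O_NONBLOCK);

        if (fd < 0)
                return -1;
        /* Include the terminating NUL, as sizeof(trigger) does above. */
        if (write(fd, trigger, strlen(trigger) + 1) < 0) {
                close(fd);
                return -1;
        }
        return fd;
}

/*
 * Wait for one trigger event.
 * Returns 1 if the trigger fired, 0 on timeout, -1 on error
 * (including POLLERR, which means the event source is gone).
 */
int psi_wait_event(int fd, int timeout_ms)
{
        struct pollfd pfd = { .fd = fd, .events = POLLPRI };
        int n = poll(&pfd, 1, timeout_ms);

        if (n < 0)
                return -1;
        if (n == 0)
                return 0;
        if (pfd.revents & POLLERR)
                return -1;
        return (pfd.revents & POLLPRI) ? 1 : 0;
}
```

A caller would format "full 100000 1000000" with psi_format_trigger(), pass
it with "/proc/pressure/memory" to psi_register_trigger(), loop on
psi_wait_event() while it returns a non-negative value, and close() the
descriptor to stop monitoring.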


