Re: [PATCH RFC 0/4] Add support for synchronous signals on perf events

On Tue, 23 Feb 2021 at 15:34, Marco Elver <elver@xxxxxxxxxx> wrote:

The perf subsystem today unifies various tracing and monitoring
features, from both software and hardware. One benefit of the perf
subsystem is that child tasks automatically inherit events, which
enables process-wide event monitoring with low overhead. By default,
perf events are non-intrusive and do not affect the behaviour of the
tasks being monitored.

For certain use-cases, however, it makes sense to leverage the
generality of the perf events subsystem and optionally allow the tasks
being monitored to receive signals on events they are interested in.
This patch series adds the option to synchronously signal user space on
events.

The discussion at [1] led to the changes proposed in this series. The
approach taken in patch 3/4 to use 'event_limit' to trigger the signal
was kindly suggested by Peter Zijlstra in [2].

[1] https://lore.kernel.org/lkml/CACT4Y+YPrXGw+AtESxAgPyZ84TYkNZdP0xpocX2jwVAbZD=-XQ@xxxxxxxxxxxxxx/
[2] https://lore.kernel.org/lkml/YBv3rAT566k+6zjg@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

Motivation and example uses:

1.      Our immediate motivation is low-overhead sampling-based race
        detection for user-space [3]. By using perf_event_open() at
        process initialization, we can create hardware
        breakpoint/watchpoint events that are propagated automatically
        to all threads in a process. As far as we are aware, today no
        existing kernel facility (such as ptrace) allows us to set up
        process-wide watchpoints with minimal overheads (that are
        comparable to mprotect() of whole pages).

        [3] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf

2.      Other low-overhead error detectors that rely on detecting
        accesses to certain memory locations or code, either
        process-wide or only in a specific set of subtasks or threads.
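
The watchpoint setup described in item 1 can be sketched with the
existing perf_event_open() interface. This is a minimal illustration
under stated assumptions, not code from the series: the helper names
watchpoint_attr_init() and watchpoint_open() are hypothetical, and the
attribute that requests a signal (added by patch 3/4) is deliberately
left out because only the existing uapi fields are used here.

```c
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: describe a write watchpoint on `len` bytes at
 * `addr`. `len` must be one of the HW_BREAKPOINT_LEN_* values.
 */
static void watchpoint_attr_init(struct perf_event_attr *attr,
                                 const void *addr, uint64_t len)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_BREAKPOINT;
    attr->size = sizeof(*attr);
    attr->bp_type = HW_BREAKPOINT_W;   /* trigger on writes */
    attr->bp_addr = (uintptr_t)addr;
    attr->bp_len = len;
    attr->sample_period = 1;           /* overflow on every hit */
    attr->inherit = 1;                 /* child tasks created later inherit it */
    attr->exclude_kernel = 1;
    attr->exclude_hv = 1;
    /* The signal opt-in proposed in patch 3/4 is intentionally not set. */
}

/* Hypothetical helper: returns an event fd, or -1 with errno set. */
static int watchpoint_open(const void *addr, uint64_t len)
{
    struct perf_event_attr attr;

    watchpoint_attr_init(&attr, addr, len);
    /* pid 0: calling task, cpu -1: any CPU, no group leader, no flags */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}
```

Calling watchpoint_open() at process initialization, before any threads
are created, is what would let the event propagate to every thread. The
call can fail (for example when no hardware breakpoint slot is free or
perf access is restricted), so callers should check for -1.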

Other example use-cases we found potentially interesting:

3.      Code hot patching without full stop-the-world. Specifically,
        set a code breakpoint at the entry to the patched routine, then
        send signals to threads and check that they are not in the
        routine, but without stopping them further. If any thread
        later enters the routine, it will receive SIGTRAP and pause.

4.      Safepoints without mprotect(). Some Java implementations use
        "load from a known memory location" as a safepoint. When threads
        need to be stopped, the page containing the location is
        mprotect()ed and threads get a signal. This can be replaced with
        a watchpoint, which requires neither a whole page nor DTLB
        shootdowns.

5.      Tracking data flow globally.

6.      Threads receiving signals on performance events to
        throttle/unthrottle themselves.


Marco Elver (4):
  perf/core: Apply PERF_EVENT_IOC_MODIFY_ATTRIBUTES to children
  signal: Introduce TRAP_PERF si_code and si_perf to siginfo
  perf/core: Add support for SIGTRAP on perf events
  perf/core: Add breakpoint information to siginfo on SIGTRAP

Note that we're currently pondering fork + exec, and suggestions would
be appreciated. We think we'll need some restrictions, like Peter
proposed here:
https://lore.kernel.org/lkml/YBvj6eJR%2FDY2TsEB@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

We think what we want is to inherit the events to children only if
cloned with CLONE_SIGHAND. If there's space for an 'inherit_mask' in
perf_event_attr, that'd be most flexible, but perhaps we do not have
the space.
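
To make the two options concrete, here is a user-space sketch of the
inheritance decision. It is an assumption-laden illustration only: both
function names are hypothetical, nothing like this exists in the series
yet, and the "all mask bits must be set" semantics for 'inherit_mask' is
just one possible interpretation.

```c
#include <linux/sched.h>
#include <stdbool.h>
#include <stdint.h>

/* Option 1 (hypothetical): inherit only if the child shares signal handlers. */
static bool inherit_if_sighand(uint64_t clone_flags)
{
    return (clone_flags & CLONE_SIGHAND) != 0;
}

/*
 * Option 2 (hypothetical): inherit only if every clone flag named in a
 * per-event inherit_mask is set. Assumed semantics, not a settled design.
 */
static bool inherit_if_mask_matches(uint64_t clone_flags, uint64_t inherit_mask)
{
    return (clone_flags & inherit_mask) == inherit_mask;
}
```

Under either rule a plain fork(), which does not pass CLONE_SIGHAND,
would not inherit the event, while threads (which must share signal
handlers) would.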

Thanks,
-- Marco


 arch/m68k/kernel/signal.c          |  3 ++
 arch/x86/kernel/signal_compat.c    |  5 ++-
 fs/signalfd.c                      |  4 +++
 include/linux/compat.h             |  2 ++
 include/linux/signal.h             |  1 +
 include/uapi/asm-generic/siginfo.h |  6 +++-
 include/uapi/linux/perf_event.h    |  3 +-
 include/uapi/linux/signalfd.h      |  4 ++-
 kernel/events/core.c               | 54 +++++++++++++++++++++++++++++-
 kernel/signal.c                    | 11 ++++++
 10 files changed, 88 insertions(+), 5 deletions(-)

--
2.30.0.617.g56c4b15f3c-goog



