Re: [PATCH v4 00/10] Add support for synchronous signals on perf events

Marco Elver <elver@xxxxxxxxxx> · Tue, 11 Jun 2024 11:18:57 +0200

On Thu, Apr 08, 2021 at 12:35PM +0200, Marco Elver wrote:
[...]
> Motivation and Example Uses
> ---------------------------
> 
> 1. 	Our immediate motivation is low-overhead sampling-based race
> 	detection for user space [1]. By using perf_event_open() at
> 	process initialization, we can create hardware
> 	breakpoint/watchpoint events that are propagated automatically
> 	to all threads in a process. As far as we are aware, today no
> 	existing kernel facility (such as ptrace) allows us to set up
> 	process-wide watchpoints with minimal overheads (that are
> 	comparable to mprotect() of whole pages).
> 
> 2.	Other low-overhead error detectors that rely on detecting
> 	accesses to certain memory locations or code, process-wide and
> 	also only in a specific set of subtasks or threads.
> 
> [1] https://llvm.org/devmtg/2020-09/slides/Morehouse-GWP-Tsan.pdf
> 
> Other ideas for use-cases we found interesting, but should only
> illustrate the range of potential to further motivate the utility (we're
> sure there are more):
> 
> 3.	Code hot patching without full stop-the-world. Specifically, by
> 	setting a code breakpoint to entry to the patched routine, then
> 	send signals to threads and check that they are not in the
> 	routine, but without stopping them further. If any of the
> 	threads will enter the routine, it will receive SIGTRAP and
> 	pause.
> 
> 4.	Safepoints without mprotect(). Some Java implementations use
> 	"load from a known memory location" as a safepoint. When threads
> 	need to be stopped, the page containing the location is
> 	mprotect()ed and threads get a signal. This could be replaced with
> 	a watchpoint, which does not require a whole page nor DTLB
> 	shootdowns.
> 
> 5.	Threads receiving signals on performance events to
> 	throttle/unthrottle themselves.
> 
> 6.	Tracking data flow globally.

For future reference:

I often wonder what happened to some new kernel feature, and how people
are using it. I'm guessing there must be other users of "synchronous
signals on perf events" somewhere by now (?), but the whole thing
started because of points #1 and #2 above.

Now 3 years later we were able to open source a framework that does #1
and #2 and more: https://github.com/google/gwpsan - "A framework for
low-overhead sampling-based dynamic binary instrumentation, designed for
implementing various bug detectors (also called "sanitizers") suitable
for production uses. GWPSan does not modify the executed code, but
instead performs dynamic analysis from signal handlers."

Documentation is sparse, and the framework is still in development and
probably has numerous sharp corners right now...

That being said, the code demonstrates how low-overhead "process-wide
synchronous event handling" thanks to perf events can be used to
implement crazier things outside the realm of performance profiling.
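
For readers who have not used the interface, here is a minimal sketch of
a self-monitoring write watchpoint that delivers a synchronous SIGTRAP.
It is not taken from GWPSan: the helper names (make_watchpoint_attr,
install_watchpoint, on_sigtrap) are hypothetical, and it assumes UAPI
headers and a kernel from Linux 5.13 or later, where the sigtrap and
sig_data attribute fields exist.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: describe a 4-byte write watchpoint on addr. */
static struct perf_event_attr make_watchpoint_attr(volatile void *addr)
{
	struct perf_event_attr attr;

	/* A 4-byte hardware watchpoint needs a 4-byte aligned address. */
	assert((uintptr_t)addr % 4 == 0);

	memset(&attr, 0, sizeof(attr));
	attr.type = PERF_TYPE_BREAKPOINT;
	attr.size = sizeof(attr);
	attr.sample_period = 1;		/* Fire on every hit. */
	attr.bp_type = HW_BREAKPOINT_W;
	attr.bp_addr = (uintptr_t)addr;
	attr.bp_len = HW_BREAKPOINT_LEN_4;
	attr.inherit = 1;		/* Children inherit the event ... */
	attr.inherit_thread = 1;	/* ... but only CLONE_THREAD ones. */
	attr.remove_on_exec = 1;	/* Required by sigtrap. */
	attr.sigtrap = 1;		/* Synchronous SIGTRAP on event. */
	attr.sig_data = (uintptr_t)addr; /* Passed back in siginfo. */
	attr.exclude_kernel = 1;	/* Allows use without */
	attr.exclude_hv = 1;		/* elevated privileges. */
	return attr;
}

static volatile sig_atomic_t trap_count;

/* Runs in the thread that touched the watched address. */
static void on_sigtrap(int sig, siginfo_t *info, void *ucontext)
{
	(void)sig;
	(void)info;	/* A real handler would check si_code == TRAP_PERF. */
	(void)ucontext;
	trap_count++;
}

/* Returns the perf event fd, or -1 with errno set. */
static int install_watchpoint(volatile void *addr)
{
	struct perf_event_attr attr = make_watchpoint_attr(addr);
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigtrap;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGTRAP, &sa, NULL) != 0)
		return -1;

	/* pid == 0, cpu == -1: monitor the calling thread on any CPU. */
	return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1,
			    PERF_FLAG_FD_CLOEXEC);
}
```

Calling install_watchpoint(&var) early in main() and then writing to var
from that thread, or from any thread created afterwards, should run
on_sigtrap in the writing thread. Hardware breakpoints may be
unavailable in some VMs and containers, in which case perf_event_open()
fails and the function returns -1.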

Thanks!

-- Marco