Re: [RFC PATCH 0/2] tracing/user_events: Remote write ABI

Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> · Mon, 31 Oct 2022 14:25:59 -0400

On 2022-10-31 13:27, Beau Belgrave wrote:

On Mon, Oct 31, 2022 at 11:15:56PM +0900, Masami Hiramatsu wrote:

[...]

And what is the actual advantage of this change? Is there any issue
with using the mmapped page? I would like to know more of the background
behind this change.

Without this change, user tracers like LTTng will have to check 2 values
instead of 1 to tell whether the kernel tracer is enabled. Mathieu is
working on a user-side tracing library so that tracing code written in
user processes works well with both kernel and user tracers without
much effort.

See here:
https://github.com/compudj/side

Are you proposing we keep the bitmap approach and have the side library
just hook another branch? Mathieu had issues with that approach during our
talks.

As the overhead of disabled tracepoints was a key factor in having the Linux
kernel adopt tracepoints when I created those back in 2008, I expect that having
minimal overhead in the disabled case will also prove to be a key factor for
adoption by user-space applications.

Another aspect that seems to be very important for wide adoption by user-space
is that the instrumentation library needs to have a license that is very
convenient for inclusion into statically linked software without additional
license requirements. This therefore excludes GPL and LGPL. I've used the MIT
license for the "side" project for that purpose.

Indeed, my ideal scenario is to use asm goto and implement something similar
to jump labels in user-space so the instrumentation only costs a no-op or a
jump when instrumentation is disabled. That can only be used in contexts where
code patching is allowed though (not for Runtime Integrity Checked (RIC) processes).

My next-best scenario is to have a single load (from a fixed offset), test, and
conditional branch in the userspace fast path instead. This approach will need
to be made available as a fallback for processes which are flagged as RIC-protected.

I currently focus my efforts on the load+test+conditional branch scheme, which is
somewhat simpler than the code patching approach in terms of needed infrastructure.
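
As a rough illustration of the load+test+conditional branch scheme, a minimal C
sketch could look like the following. All names here (side_event_state,
my_event, my_event_slow_path, my_event_demo) are hypothetical and not taken from
the side library, and the sketch assumes the GCC/Clang __atomic and
__builtin_expect builtins:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-event state; a tracer would set `enabled` to non-zero. */
struct side_event_state {
	uint32_t enabled;
};

static struct side_event_state my_event_state;	/* starts disabled (zero) */
static unsigned int my_event_hits;		/* counts slow-path entries */

/* Slow path: stands in for serializing and writing the event payload. */
static void my_event_slow_path(int value)
{
	(void) value;
	my_event_hits++;
}

/* Fast path: one load from a fixed address, one test, one conditional branch. */
static inline void my_event(int value)
{
	if (__builtin_expect(__atomic_load_n(&my_event_state.enabled,
					     __ATOMIC_RELAXED) != 0, 0))
		my_event_slow_path(value);
}

/* Usage: returns the slow-path count after one disabled and one enabled call. */
static unsigned int my_event_demo(void)
{
	my_event(1);				/* disabled: branch not taken */
	assert(my_event_hits == 0);
	__atomic_store_n(&my_event_state.enabled, 1, __ATOMIC_RELAXED);
	my_event(2);				/* enabled: slow path runs once */
	return my_event_hits;
}
```

The point of the sketch is that the disabled case costs only the load, the
test, and a branch that is hinted as not taken.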

If we go for the current user events bitmap approach, then anything we do from
userspace will have more overhead (additional pointer chasing, loads, and masks
to apply). And it pretty much rules out code patching.
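
To make the extra work concrete, here is a hedged C sketch contrasting the two
checks. The layout below (status_page, status_page_ptr, bitmap_event,
single_event) is purely hypothetical and is not the user_events ABI; it only
illustrates where the additional pointer load, indexing, and masking come from
when a user tracer has to check its own flag plus a bit in a shared bitmap:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shared status page, standing in for an mmap'd bitmap. */
static uint8_t status_page[64];
static const uint8_t *status_page_ptr = status_page;

struct bitmap_event {
	uint32_t user_enabled;	/* set by a user-space tracer */
	uint32_t status_bit;	/* bit index into the shared page */
};

/* Two values to check: the user tracer flag and the kernel bitmap bit. */
static inline int bitmap_event_enabled(const struct bitmap_event *ev)
{
	const uint8_t *page = status_page_ptr;		/* pointer load */
	uint8_t byte = page[ev->status_bit >> 3];	/* index load + byte load */
	int kernel_on = byte & (1U << (ev->status_bit & 7));	/* mask */

	return ev->user_enabled || kernel_on;
}

struct single_event {
	uint32_t enabled;	/* single word written by any tracer */
};

/* One value to check: a single load and test. */
static inline int single_event_enabled(const struct single_event *ev)
{
	return ev->enabled != 0;
}

/* Usage: returns the bitmap check result after setting the event's bit. */
static int bitmap_demo(void)
{
	struct bitmap_event ev = { .user_enabled = 0, .status_bit = 10 };

	assert(!bitmap_event_enabled(&ev));
	status_page[ev.status_bit >> 3] |= (uint8_t)(1U << (ev.status_bit & 7));
	return bitmap_event_enabled(&ev);
}
```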

In terms of missing pieces to allow code patching to be done in userspace, here
is what I think we'd need:

- Extend the "side" (MIT-licensed) library to implement gadgets which support
code patching, but fall back to load+test+conditional branch if code patching
is not available. Roughly, those would look like (this is really just pseudo-code):

.pushsection side_jmp
/*
 * &side_enabled_value is the key used to change the enabled/disabled state.
 * 1f is the address of the code to patch.
 * 3f is the address of branch target when disabled.
 * 4f is the address of branch target when enabled.
 */
.quad &side_enabled_value, 1f, 3f, 4f
.popsection

/*
 * Place all jump instructions that will be modified by code patching into a
 * single section. Therefore, this will minimize the amount of COW required when
 * patching code from executables and shared libraries that have instances in
 * many processes.
 */
.pushsection side_jmp_modify_code (executable section)
1:
jump to 2f
.popsection

jump to 1b
2:
load side_enabled_value
test
cond. branch to 4f
3:
-> disabled
4:
-> enabled

When loading the .so or the executable, the initial state uses the load,
test, conditional branch sequence. Then in a constructor, if code patching is
available, the jump at label (1) can be updated to target (3) instead. Then when
enabled, it can be updated to target (4) instead.

- Implement a code patching system call in the kernel which takes care of all the
details associated with code patching that supports concurrent execution (breakpoint
bypass, or stopping target processes if required by the architecture). This system
call could check whether the target process has Runtime Integrity Check enforced,
and refuse code patching as needed.

As a nice side-effect, this could allow us to implement things like "alternative"
assembler instruction selection in user-space.

- Figure out a way for a user-space process to let the kernel know that it needs
to enforce Runtime Integrity Check. It could be either a prctl(), or perhaps a
clone flag if this needs to be known very early in the process lifetime.
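
The jump-target transitions described for the first item above (initially
target label 2, then label 3 from a constructor if patching is available, then
label 4 when enabled) can be modelled in plain C. This is only a simulation
under stated assumptions: no code is actually patched, and every name
(side_jmp_entry, side_jmp_constructor, side_jmp_set_enabled,
side_jmp_reaches_enabled, side_jmp_demo) is hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Where the patchable jump at label (1) currently points. */
enum side_jmp_target {
	SIDE_JMP_TO_2,	/* load + test + conditional branch (initial state) */
	SIDE_JMP_TO_3,	/* patched: straight to the disabled path */
	SIDE_JMP_TO_4,	/* patched: straight to the enabled path */
};

/* Simplified model of one side_jmp section entry. */
struct side_jmp_entry {
	uint32_t *key;			/* stands in for &side_enabled_value */
	enum side_jmp_target target;	/* stands in for the jump at label (1) */
};

/* Constructor step: switch to the patched form only if patching is allowed. */
static void side_jmp_constructor(struct side_jmp_entry *e, bool patching_available)
{
	if (patching_available)
		e->target = SIDE_JMP_TO_3;
}

/* Enable/disable: always update the key; retarget the jump only if patched. */
static void side_jmp_set_enabled(struct side_jmp_entry *e, bool enabled)
{
	*e->key = enabled;
	if (e->target != SIDE_JMP_TO_2)
		e->target = enabled ? SIDE_JMP_TO_4 : SIDE_JMP_TO_3;
}

/* Which path would execution reach: enabled (true) or disabled (false)? */
static bool side_jmp_reaches_enabled(const struct side_jmp_entry *e)
{
	switch (e->target) {
	case SIDE_JMP_TO_3:
		return false;
	case SIDE_JMP_TO_4:
		return true;
	default:
		return *e->key != 0;	/* fallback: load, test, branch */
	}
}

/* Usage: walk one entry through construction and enabling. */
static bool side_jmp_demo(bool patching_available)
{
	uint32_t side_enabled_value = 0;
	struct side_jmp_entry e = { &side_enabled_value, SIDE_JMP_TO_2 };

	side_jmp_constructor(&e, patching_available);
	assert(!side_jmp_reaches_enabled(&e));	/* disabled either way */
	side_jmp_set_enabled(&e, true);
	return side_jmp_reaches_enabled(&e);
}
```

Both modes end up on the enabled path once the key is set; the difference is
only whether the decision comes from the patched jump or from the load and test.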

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com