Re: [PATCH v1 0/4] [RFC] Implement Trampoline File Descriptor

"Madhavan T. Venkataraman" <madvenka@xxxxxxxxxxxxxxxxxxx> · Tue, 28 Jul 2020 11:32:32 -0500

Thanks. See inline..

On 7/28/20 10:13 AM, David Laight wrote:
> From:  madvenka@xxxxxxxxxxxxxxxxxxx
>> Sent: 28 July 2020 14:11
> ...
>> The kernel creates the trampoline mapping without any permissions. When
>> the trampoline is executed by user code, a page fault happens and the
>> kernel gets control. The kernel recognizes that this is a trampoline
>> invocation. It sets up the user registers based on the specified
>> register context, and/or pushes values on the user stack based on the
>> specified stack context, and sets the user PC to the requested target
>> PC. When the kernel returns, execution continues at the target PC.
>> So, the kernel does the work of the trampoline on behalf of the
>> application.
> Isn't the performance of this going to be horrid?

It takes about the same amount of time as getpid(). So, it is
one quick trip into the kernel. I expect that applications will
typically not care about this extra overhead as long as
they are able to run.

But I agree that if there is an application that cannot tolerate
this extra overhead, then it is an issue. See below for further
discussion.

In the libffi changes I have included in the cover letter, I have
done it in such a way that trampfd is chosen when current
security settings don't allow other methods such as
loading trampoline code into a file and mapping it. In this
case, the application can at least run with trampfd.

>
> If you don't care that much about performance the fixup can
> all be done in userspace within the fault signal handler.

I do care about performance.

This is a framework to address trampolines. In this initial
work, I want to establish one basic way for things to work.
In the future, trampfd can be enhanced for performance.
For instance, it is easy for an architecture to generate
the exact instructions required to load specified registers,
push specified values on the stack and jump to a target
PC. The kernel can map a page with the generated code
with execute permissions. In this case, the performance
issue goes away.
> Since whatever you do needs the application changed why
> not change the implementation of nested functions to not
> need on-stack executable trampolines.

I kinda agree with your suggestion.

But it is up to the GCC folks to change its implementation.
I am trying to provide a way for their existing implementation
to work in a more secure way.
> I can think of other alternatives that don't need much more
> than an array of 'push constant; jump trampoline' instructions
> be created (all jump to the same place).
>
> You might want something to create an executable page of such
> instructions.

Agreed. And that can be done within this framework as
I have mentioned above.

But it is not just this trampoline type that I have implemented
in this patchset. In the future, other types can be implemented
and other contexts can be defined. Basically, the approach is
for the user to supply a recipe to the kernel and leave it up to
the kernel to do it in the best way possible. I am hoping that
other forms of dynamic code can be addressed in the future
using the same framework.

*Purely as a hypothetical example*, a user can supply
instructions in a language such as BPF that the kernel
understands and have the kernel arrange for that to
be executed in user context.

Madhavan

> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)