Re: [RFC] security: replace indirect calls with static calls

Casey Schaufler <casey@xxxxxxxxxxxxxxxx> · Mon, 24 Aug 2020 10:54:26 -0700

On 8/24/2020 10:04 AM, Brendan Jackman wrote:
> On Mon, 24 Aug 2020 at 18:43, Casey Schaufler <casey@xxxxxxxxxxxxxxxx> wrote:
>> On 8/24/2020 8:20 AM, Brendan Jackman wrote:
>>> On Fri, 21 Aug 2020 at 00:46, Casey Schaufler <casey@xxxxxxxxxxxxxxxx> wrote:
>>>> On 8/20/2020 9:47 AM, Brendan Jackman wrote:
>>> [...]
>>>> What does NOP really look like?
>>> The NOP is the same as a regular function call but the CALL
>>> instruction is replaced with a NOP instruction. The code that sets up
>>> the call parameters is unchanged, and so is the code that expects to
>>> get the return value in eax or whatever.
>> Right. Are you saying that NOP is in-line assembler in your switch?
> That's right - although it's behind the static_call API that the patch
> depends on ([5] in the original mail).
>
>>> That means we cannot actually
>>> call the static_calls for NULL slots, we'd get undefined behaviour
>>> (except for void hooks) - this is what Peter is talking about in the
>>> sibling thread.
>> Referring to the "sibling thread" is kinda confusing, and
>> assumes everyone is one all the right mailing lists, and knows
>> which other thread you're talking about.
> Sure, sorry - here's the Lore link for future reference:
>
> https://lore.kernel.org/lkml/20200820164753.3256899-1-jackmanb@xxxxxxxxxxxx/T/#m5a6fb3f10141049ce43e18a41f154796090ae1d5
>
>>> For this reason, there are _no gaps_ in the callback table. For a
>>> given LSM hook, all the slots after base_slot_idx are filled,
>> Why go to all the trouble of maintaining the base_slot_idx
>> if NOP is so cheap? Why not fill all unused slots with NOP?
>> Worst case would be a hook with no users, in which case you
>> have 11 NOPS in the void hook case and 11 "if (ret != DEFAULT_RET)"
>> and 11 NOPS in the int case. No switch magic required. Even
>> better, in the int case you have two calls/slot, the first is the
>> module supplied function (or NOP) and the second is
>>         int isit(int ret) { return (ret != DEFAULT_RET) ? ret : 0; }
>> (or NOP).
>>
>> The no security module case degenerates to 22 NOP instructions
>> and no if checks of any sort. I'm not the performance guy, but
>> that seems better than maintaining and checking base_slot_idx
>> to me.
> The switch trick is not really motivated by performance.

Then what is its motivation? It makes the code more complicated,
and is unnecessary.

> I think all the focus on the NOPs themselves is a bit misleading here
> - we _can't_ execute the NOPs for the int hooks, because there are
> instructions after them that expect a function to have just returned a
> value, which NOP doesn't do.

That's what I was hoping to address with the second call in the slot.
The first call in the slot would be either the module supplied code
or a NOP if the module isn't using the hook. The second would be
supplied by the LSM infrastructure and would be NOP if the module
didn't use the hook. The LSM supplied function would do the necessary
checking. Its more complicated than the void case, but not that much
more complicated than the existing list based scheme.

The concern about the non-existent return on a NOP can be dealt with
by setting up initial conditions correctly in most cases. Dealing with
multiple security modules providing information (e.g. secid_to_secctx)
is where it gets tricky.

>  When there is a NOP in the slot instead
> of a CALL, it would appear to "return" whatever value is leftover in
> the return register. At the C level, this is why the static_call API
> doesn't allow static_call_cond to return a value (which is what PeterZ
> is referring to in the thread I linked above).
>
> So, we could drop the switch trick for void hooks and just use
> static_call_cond, but this doesn't work for int hooks. IMO that
> variation between the two hook types would just add confusion.

With the number of cases where the switch trick isn't going to
work in the long term I'm disinclined to think it makes things
less confusing.

>>>>> +#define __UNROLL_MACRO_LOOP_20(MACRO, ...) \
>>>>> + __UNROLL_MACRO_LOOP_19(MACRO, __VA_ARGS__) \
>>>>> + MACRO(19, __VA_ARGS__)
>>>>> +
>>>> Where does "20" come from? Why are you unrolling beyond 11?
>>> It's just an arbitrary limit on the unrolling macro implementation, we
>>> aren't actually unrolling beyond 11 where the macro is used (N is set
>>> to 11).
>> I'm not a fan of including macros you can't use, especially
>> when they're just obvious variants of other macros.
> Not sure what you mean here - is there already a macro that does what
> UNROLL_MACRO_LOOP does?

No, I'm saying that __UNROLL_MACRO_LOOP_20() will never be used on
a system that has at most 11 security modules. You've added a bunch of
text that serves no purpose. "Future expansion" is pretty silly here.