On Mon, 24 Aug 2020 at 18:43, Casey Schaufler <casey@xxxxxxxxxxxxxxxx> wrote: > > On 8/24/2020 8:20 AM, Brendan Jackman wrote: > > On Fri, 21 Aug 2020 at 00:46, Casey Schaufler <casey@xxxxxxxxxxxxxxxx> wrote: > >> On 8/20/2020 9:47 AM, Brendan Jackman wrote: > > [...] > >> What does NOP really look like? > > The NOP is the same as a regular function call but the CALL > > instruction is replaced with a NOP instruction. The code that sets up > > the call parameters is unchanged, and so is the code that expects to > > get the return value in eax or whatever. > > Right. Are you saying that NOP is in-line assembler in your switch? That's right - although it's behind the static_call API that the patch depends on ([5] in the original mail). > > That means we cannot actually > > call the static_calls for NULL slots, we'd get undefined behaviour > > (except for void hooks) - this is what Peter is talking about in the > > sibling thread. > > Referring to the "sibling thread" is kinda confusing, and > assumes everyone is one all the right mailing lists, and knows > which other thread you're talking about. Sure, sorry - here's the Lore link for future reference: https://lore.kernel.org/lkml/20200820164753.3256899-1-jackmanb@xxxxxxxxxxxx/T/#m5a6fb3f10141049ce43e18a41f154796090ae1d5 > > > > For this reason, there are _no gaps_ in the callback table. For a > > given LSM hook, all the slots after base_slot_idx are filled, > > Why go to all the trouble of maintaining the base_slot_idx > if NOP is so cheap? Why not fill all unused slots with NOP? > Worst case would be a hook with no users, in which case you > have 11 NOPS in the void hook case and 11 "if (ret != DEFAULT_RET)" > and 11 NOPS in the int case. No switch magic required. Even > better, in the int case you have two calls/slot, the first is the > module supplied function (or NOP) and the second is > int isit(int ret) { return (ret != DEFAULT_RET) ? ret : 0; } > (or NOP). > > The no security module case degenerates to 22 NOP instructions > and no if checks of any sort. I'm not the performance guy, but > that seems better than maintaining and checking base_slot_idx > to me. The switch trick is not really motivated by performance. I think all the focus on the NOPs themselves is a bit misleading here - we _can't_ execute the NOPs for the int hooks, because there are instructions after them that expect a function to have just returned a value, which NOP doesn't do. When there is a NOP in the slot instead of a CALL, it would appear to "return" whatever value is leftover in the return register. At the C level, this is why the static_call API doesn't allow static_call_cond to return a value (which is what PeterZ is referring to in the thread I linked above). So, we could drop the switch trick for void hooks and just use static_call_cond, but this doesn't work for int hooks. IMO that variation between the two hook types would just add confusion. > >>> +#define __UNROLL_MACRO_LOOP_20(MACRO, ...) \ > >>> + __UNROLL_MACRO_LOOP_19(MACRO, __VA_ARGS__) \ > >>> + MACRO(19, __VA_ARGS__) > >>> + > >> Where does "20" come from? Why are you unrolling beyond 11? > > It's just an arbitrary limit on the unrolling macro implementation, we > > aren't actually unrolling beyond 11 where the macro is used (N is set > > to 11). > > I'm not a fan of including macros you can't use, especially > when they're just obvious variants of other macros. Not sure what you mean here - is there already a macro that does what UNROLL_MACRO_LOOP does?