Re: [PATCH bpf-next 0/4] Reduce overhead of LSMs with static calls

Casey Schaufler <casey@xxxxxxxxxxxxxxxx> · Mon, 13 Feb 2023 10:04:03 -0800

On 2/12/2023 2:00 PM, Paul Moore wrote:
> On Fri, Feb 10, 2023 at 9:32 PM Casey Schaufler <casey@xxxxxxxxxxxxxxxx> wrote:
>> On 2/10/2023 12:03 PM, Paul Moore wrote:
>>> On Thu, Feb 9, 2023 at 11:56 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>>>> On Fri, Jan 27, 2023 at 03:16:38PM -0500, Paul Moore wrote:
>>>>> On Thu, Jan 19, 2023 at 6:10 PM KP Singh <kpsingh@xxxxxxxxxx> wrote:
>>>>>> # Background
>>>>>>
>>>>>> LSM hooks (callbacks) are currently invoked as indirect function calls. These
>>>>>> callbacks are registered into a linked list at boot time as the order of the
>>>>>> LSMs can be configured on the kernel command line with the "lsm=" command line
>>>>>> parameter.
>>>>> Thanks for sending this KP.  I had hoped to make a proper pass through
>>>>> this patchset this week but I ended up getting stuck trying to wrap my
>>>>> head around some network segmentation offload issues and didn't quite
>>>>> make it here.  Rest assured it is still in my review queue.
>>>>>
>>>>> However, I did manage to take a quick look at the patches and one of
>>>>> the first things that jumped out at me is it *looks* like this
>>>>> patchset is attempting two things: fix a problem where one LSM could
>>>>> trample another (especially problematic with the BPF LSM due to its
>>>>> nature), and reduce the overhead of making LSM calls.  I realize that
>>>>> in this patchset the fix and the optimization are heavily
>>>>> intermingled, but I wonder what it would take to develop a standalone
>>>>> fix using the existing indirect call approach?  I'm guessing that is
>>>>> going to potentially be a pretty significant patch, but if we could
>>>>> add a little standardization to the LSM hooks without adding too much
>>>>> in the way of code complexity or execution overhead I think that might
>>>>> be a win independent of any changes to how we call the hooks.
>>>>>
>>>>> Of course this could be crazy too, but I'm the guy who has to ask
>>>>> these questions :)
>>>> Hm, I am expecting this patch series to _not_ change any semantics of
>>>> the LSM "stack". I would agree: nothing should change in this series, as
>>>> it should be strictly a mechanical change from "iterate a list of
>>>> indirect calls" to "make a series of direct calls". Perhaps I missed
>>>> a logical change?
>>> I might be missing something too, but I'm thinking of patch 4/4 in
>>> this series that starts with this sentence:
>>>
>>>  "BPF LSM hooks have side-effects (even when a default value is
>>>   returned), as some hooks end up behaving differently due to
>>>   the very presence of the hook."
>> My understanding of the current "agreement" is that we keep BPF
>> hooks at the end for this very reason.
> It would be nice to not have these conventions.  I get that it's the
> only knob we have at the moment to tweak, but I would hope that we
> could do better in the future.

Agreed. I don't care much for it.

>
> Ignoring the static call changes for a moment, I'm curious what it
> would look like to have a better mechanism for handling things like
> this.  What would it look like if we expanded the individual LSM error
> reporting back to the LSM layer to have a bit more information, e.g.
> "this LSM erred, but it is safe to continue evaluating other LSMs" and
> "this LSM erred, and it was too severe to continue evaluating other
> LSMs"?  Similarly, would we want to expand the hook registration to
> have more info, e.g. "run this hook even when other LSMs have failed"
> and "if other LSMs have failed, do not run this hook"?
>> I really don't want another LSM to have sway over Smack enforcement.
> I think we can all agree that the one LSM should not have the ability
> to affect the operation of another, especially when it would cause the
> violation of a different LSM's security policy.
>
>> I would hate to see, for example, an LSM decide that because it has
>> initialized an inode no other LSM should be allowed to, even in an
>> error situation. There are really only two options Call all the hooks
>> every time and either succeed on all or report the most important
>> error. Or, "bail on fail", and acknowledge that following hooks may
>> not be called. Really, does "I failed, but it's not that important"
>> make sense as a return value?
> Of the two things I tossed out, richer return values and richer hook
> registration, perhaps it's really only the latter, richer hook
> registration that is important here.  It would allow a LSM to indicate
> to the LSM hook layer how the individual hook implementation should be
> called: always, or only if previously called implementations have not
> failed.  I believe that should eliminate any worry of a BPF LSM, or
> any LSM for that matter, from impacting the security policy of
> another.  However, I will admit that I haven't spent the necessary
> amount of time chasing down all the hooks to verify if that is 100%
> correct.
>
>>> I realize that loading a BPF LSM is a privileged operation so we've
>>> largely mitigated the risk there, but with stacking on it's way to
>>> being more full featured, and IMA slowly working its way to proper LSM
>>> status, it seems to me like having a richer, and proper way to handle
>>> individual LSM failures would be a good thing.  I feel like patch 4/4
>>> definitely hints at this, but I could be mistaken.
>> We have bigger issues with BPF. There's nothing to prevent BPF from
>> implementing a secid_to_secctx() hook and making a system with SELinux
>> go cattywampus. BPF is stacked as if it isn't a "major" LSM, while
>> allowing it to do "major" LSM things. One reason we need full stacking
>> is to address this.
> That's a different issue, and one of the reasons why I suggested
> taking an all-or-nothing approach to stacking many years ago, but ...
> well, you know how that worked out.  I promise to not keep saying "I
> told you so" if you promise to not keep bringing up LSM stacking as
> the answer to all that ails you ;)
>