On Wed, Jul 3, 2024 at 10:56 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote: > > On Wed, Jul 3, 2024 at 12:55 PM KP Singh <kpsingh@xxxxxxxxxx> wrote: > > On Wed, Jul 3, 2024 at 2:07 AM Paul Moore <paul@xxxxxxxxxxxxxx> wrote: > > > > > > On Jun 29, 2024 KP Singh <kpsingh@xxxxxxxxxx> wrote: > > > > > > > > LSM hooks are currently invoked from a linked list as indirect calls > > > > which are invoked using retpolines as a mitigation for speculative > > > > attacks (Branch History / Target injection) and add extra overhead which > > > > is especially bad in kernel hot paths: > > > > [...] > > > > > should fix the more obvious problems. I'd like to know if you are > > > aware of any others? If not, the text above should be adjusted and > > > we should reconsider patch 5/5. If there are other problems I'd > > > like to better understand them as there may be an independent > > > solution for those particular problems. > > > > We did have problems with some other hooks but I was unable to dig up > > specific examples though, it's been a while. More broadly speaking, a > > default decision is still a decision. Whereas the intent from the BPF > > LSM is not to make a default decision unless a BPF program is loaded. > > I am quite worried about the holes this leaves open, subtle bugs > > (security or crashes) we have not caught yet and PATCH 5/5 engineers away > > the problem of the "default decision". > > The inode/xattr problem you originally mentioned wasn't really rooted > in a "bad" default return value, it was really an issue with how the > LSM hook was structured due to some legacy design assumptions made > well before the initial stacking patches were merged. That should be > fixed now[1] and given that the inode/xattr set/remove hooks were > unique in this regard (individual LSMs were responsible for performing > the capabilities checks) I don't expect this to be a general problem. > > There were also some issues caused by the fact that we were defining > the default return value in multiple places and these values had gone > out of sync in a number of hooks. We've also fixed this problem by > only defining the default return value once for each hook, solving all > of those problems. I don't see how this solves problems or prevents any future problems with side-effects. I have always been uncomfortable with an extraneous function being called with a side effect ever since we merged BPF LSM with default callback. We have found one bug due to this, not all the bugs. > > I'm not aware of any other existing problems relating to the LSM hook > default values, if there are any, we need to fix them independent of > this patchset. The LSM framework should function properly if the > "default" values are used. Patch 5 eliminates the possibilities of errors and subtle bugs all together. The problem with subtle bugs is, well, they are subtle, if you and I knew of the bugs, we would fix all of them, but we don't. I really feel we ought to eliminate the class of issues and not just whack-a-mole when we see the bugs. The LSM framework ought to function with default values is a nice goal, but a weaker position than what we have where we make these subtle bugs impossible. If you feel this should not be a part of the framework, I would still like to revert back to implementation where we kept the toggling logic contained to the BPF LSM. - KP > > [1] I just realized that commit 61df7b828204 doesn't properly update > the removexattr implementations for SELinux and Smack, expect a patch > for that soon. The current code is okay, it just does some > unnecessary work (see the setxattr changes to get an idea of the > changes needed). > > > > > +#define lsm_for_each_hook(scall, NAME) \ > > > > + for (scall = static_calls_table.NAME; \ > > > > + scall - static_calls_table.NAME < MAX_LSM_COUNT; scall++) \ > > > > + if (static_key_enabled(&scall->active->key)) > > > > > > This is probably a stupid question, but why use static_key_enabled() > > > here instead of static_branch_unlikely() as in the call_XXX_macros? > > > > The static_key_enabled is a check for the key being enabled, whereas > > the static_branch_likely is for laying out the right assembly code > > (jump tables etc.) and based on the value of the static key, here we > > are not using the static calls or even keys, rather we are following > > back from direct calls to indirect calls and don't really need any > > jump tables in the slow path. > > Gotcha, thanks for the explanation. > > -- > paul-moore.com