On 7/3/2024 4:08 PM, KP Singh wrote: > On Thu, Jul 4, 2024 at 12:52 AM Paul Moore <paul@xxxxxxxxxxxxxx> wrote: >> On Wed, Jul 3, 2024 at 6:22 PM KP Singh <kpsingh@xxxxxxxxxx> wrote: >>> On Wed, Jul 3, 2024 at 10:56 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote: >>>> On Wed, Jul 3, 2024 at 12:55 PM KP Singh <kpsingh@xxxxxxxxxx> wrote: >>>>> On Wed, Jul 3, 2024 at 2:07 AM Paul Moore <paul@xxxxxxxxxxxxxx> wrote: >>>>>> On Jun 29, 2024 KP Singh <kpsingh@xxxxxxxxxx> wrote: >>>>>>> LSM hooks are currently invoked from a linked list as indirect calls >>>>>>> which are invoked using retpolines as a mitigation for speculative >>>>>>> attacks (Branch History / Target injection) and add extra overhead which >>>>>>> is especially bad in kernel hot paths: >>>>> [...] >>>>> >>>>>> should fix the more obvious problems. I'd like to know if you are >>>>>> aware of any others? If not, the text above should be adjusted and >>>>>> we should reconsider patch 5/5. If there are other problems I'd >>>>>> like to better understand them as there may be an independent >>>>>> solution for those particular problems. >>>>> We did have problems with some other hooks but I was unable to dig up >>>>> specific examples though, it's been a while. More broadly speaking, a >>>>> default decision is still a decision. Whereas the intent from the BPF >>>>> LSM is not to make a default decision unless a BPF program is loaded. >>>>> I am quite worried about the holes this leaves open, subtle bugs >>>>> (security or crashes) we have not caught yet and PATCH 5/5 engineers away >>>>> the problem of the "default decision". >>>> The inode/xattr problem you originally mentioned wasn't really rooted >>>> in a "bad" default return value, it was really an issue with how the >>>> LSM hook was structured due to some legacy design assumptions made >>>> well before the initial stacking patches were merged. That should be >>>> fixed now[1] and given that the inode/xattr set/remove hooks were >>>> unique in this regard (individual LSMs were responsible for performing >>>> the capabilities checks) I don't expect this to be a general problem. >>>> >>>> There were also some issues caused by the fact that we were defining >>>> the default return value in multiple places and these values had gone >>>> out of sync in a number of hooks. We've also fixed this problem by >>>> only defining the default return value once for each hook, solving all >>>> of those problems. >>> I don't see how this solves problems or prevents any future problems >>> with side-effects. I have always been uncomfortable with an extraneous >>> function being called with a side effect ever since we merged BPF LSM >>> with default callback. We have found one bug due to this, not all the >>> bugs. >> You've got to give me something more concrete than that. If you can't >> provide any concrete examples, start with providing a basic concept >> with far more detail than just "side-effects". >> >>>> I'm not aware of any other existing problems relating to the LSM hook >>>> default values, if there are any, we need to fix them independent of >>>> this patchset. The LSM framework should function properly if the >>>> "default" values are used. >>> Patch 5 eliminates the possibilities of errors and subtle bugs all >>> together. The problem with subtle bugs is, well, they are subtle, if >>> you and I knew of the bugs, we would fix all of them, but we don't. I >>> really feel we ought to eliminate the class of issues and not just >>> whack-a-mole when we see the bugs. >> Here's the thing, I don't really like patch 5/5. To be honest, I >> don't really like a lot of this patchset. From my perspective, the >> complexity of the code is likely going to mean more maintenance >> headaches down the road, but Linus hath spoken so we're doing this >> (although "this" is still a bit undefined as far as I'm concerned). >> If you want me to merge patch 5/5 you've got to give me something real >> and convincing that can't be fixed by any other means. My current >> opinion is that you're trying to use a previously fixed bug to scare >> and/or coerce the merging of some changes I don't really want to >> merge. If you want me to take patch 5/5, you've got to give me a >> reason that is far more compelling that what you've written thus far. > Paul, I am not scaring you, I am providing a solution that saves us > from headaches with side-effects and bugs in the future. It's safer by > design. > > You say you have not reviewed it carefully, but you did ask me to move > the function from the BPF LSM layer to an LSM API, and we had a bunch > of discussion around naming in the subsequent revisions. > > https://lore.kernel.org/bpf/f7e8a16b0815d9d901e019934d684c5f@xxxxxxxxxxxxxx/ > > My reasons are: > > 1. It's safer, no side effects, guaranteed to be not buggy. Neither > you, nor me, can guarantee that a default value will be safe in the > LSM layer. I request others (Casey, Kees) for their opinion here too. > 2. Performance, no extra function call. I want to be very careful about the comments I make on this patch set. I can't say that I trust any fix for the BPF LSM layer. My natural inclination is to isolate the fix to the area that has the problem, that is, BPF. I have a hard time accepting the notion that the implementation will really fix all possible bugs in the future. The pace at which eBPF is evolving gives me the heebee geebees when I think of it as a mechanism for implementing security modules. My biggest concern is that we may be trying too hard for perfection. I see a situation where we're not moving forward because there are two reasonable solutions and rather than running with either we're desperately looking for a compelling reason to pick one over the other. > > If you still don't like it, it's your call, I would still like to keep > most of the logic local to the BPF LSM as proposed in > https://lore.kernel.org/bpf/f7e8a16b0815d9d901e019934d684c5f@xxxxxxxxxxxxxx/ > > - KP > >> -- >> paul-moore.com