On 2/27/2023 5:14 PM, Linus Torvalds wrote: > On Wed, Jan 25, 2023 at 7:56 AM Mateusz Guzik <mjguzik@xxxxxxxxx> wrote: >> +static inline bool cap_isidentical(const kernel_cap_t a, const kernel_cap_t b) >> +{ >> + unsigned __capi; >> + CAP_FOR_EACH_U32(__capi) { >> + if (a.cap[__capi] != b.cap[__capi]) >> + return false; >> + } >> + return true; >> +} >> + > Side note, and this is not really related to this particular patch > other than because it just brought up the issue once more.. > > Our "kernel_cap_t" thing is disgusting. > > It's been a structure containing > > __u32 cap[_KERNEL_CAPABILITY_U32S]; > > basically forever, and it's not likely to change in the future. I > would object to any crazy capability expansion, considering how > useless and painful they've been anyway, and I don't think anybody > really is even remotely planning anything like that anyway. > > And what is _KERNEL_CAPABILITY_U32S anyway? It's the "third version" > of that size: > > #define _KERNEL_CAPABILITY_U32S _LINUX_CAPABILITY_U32S_3 > > which happens to be the same number as the second version of said > #define, which happens to be "2". > > In other words, that fancy array is just 64 bits. And we'd probably be > better off just treating it as such, and just doing > > typedef u64 kernel_cap_t; > > since we have to do the special "convert from user space format" > _anyway_, and this isn't something that is shared to user space as-is. > > Then that "cap_isidentical()" would literally be just "a == b" instead > of us playing games with for-loops that are just two wide, and a > compiler that may or may not realize. > > It would literally remove some of the insanity in <linux/capability.h> > - look for CAP_TO_MASK() and CAP_TO_INDEX and CAP_FS_MASK_B0 and > CAP_FS_MASK_B1 and just plain ugliness that comes from this entirely > historical oddity. > > Yes, yes, we started out having it be a single-word array, and yes, > the code is written to think that it might some day be expanded past > the two words it then in 2008 it expanded to two words and 64 bits. > And now, fifteen years later, we use 40 of those 64 bits, and > hopefully we'll never add another one. I agree that the addition of 24 more capabilities is unlikely. The two reasons presented recently for adding capabilities are to implement boutique policies (CAP_MYHARDWAREISSPECIAL) or to break up CAP_SYS_ADMIN. Neither of these is sustainable with a finite number of capabilities, nor do they fit the security model capabilities implement. It's possible that a small number of additional capabilities will be approved, but even that seems unlikely. > So we have historical reasons for why our kernel_cap_t is so odd. But > it *is* odd. > > Hmm? I don't see any reason that kernel_cap_t shouldn't be a u64. If by some amazing change in mindset we develop need for 65 capabilities, someone can dredge up the old code, shout "I told you so!" and put it back the way it was. Or maybe by then we'll have u128, and can just switch to that. > Linus