Alistair Delva <adelva@xxxxxxxxxx> writes: > On Mon, Nov 15, 2021 at 11:04 AM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote: >> >> On Mon, Nov 15, 2021 at 7:14 PM Alistair Delva <adelva@xxxxxxxxxx> wrote: >> > Booting to Android userspace on 5.14 or newer triggers the following >> > SELinux denial: >> > >> > avc: denied { sys_nice } for comm="init" capability=23 >> > scontext=u:r:init:s0 tcontext=u:r:init:s0 tclass=capability >> > permissive=0 >> > >> > Init is PID 0 running as root, so it already has CAP_SYS_ADMIN. For >> > better compatibility with older SEPolicy, check ADMIN before NICE. >> >> But with this patch you in turn punish the new/better policies that >> try to avoid giving domains CAP_SYS_ADMIN unless necessary (using only >> the more granular capabilities wherever possible), which may now get a >> bogus sys_admin denial. IMHO the order is better as it is, as it >> motivates the "good" policy writing behavior - i.e. spelling out the >> capability permissions more explicitly and avoiding CAP_SYS_ADMIN. >> >> IOW, if you domain does CAP_SYS_NICE things, and you didn't explicitly >> grant it that (and instead rely on the CAP_SYS_ADMIN fallback), then >> the denial correctly flags it as an issue in your policy and >> encourages you to add that sys_nice permission to the domain. Then >> when one beautiful hypothetical day the CAP_SYS_ADMIN fallback is >> removed, your policy will be ready for that and things will keep >> working. >> >> Feel free to carry that patch downstream if patching the kernel is >> easier for you than fixing the policy, but for the upstream kernel >> this is just a step in the wrong direction. > > I'm personally fine with this position, but I am curious why "never > break userspace" doesn't apply to SELinux policies. At the end of the > day, booting 5.13 or older, we don't get a denial, and there's nothing > for the sysadmin to do. On 5.14 and newer, we get denials. This is a > common pattern we see each year: some new capability or permission is > required where it wasn't required before, and there's no compatibility > mechanism to grandfather in old policies. So, we have to touch > userspace. If this is just how things are, I can certainly update our > init.te definitions. User space is not broken? If you just ignore this AVC denial then it will pass on cap_sys_admin. In other words everything still works, you only get a AVC denial for cap_sys_nice now. > >> > Fixes: 9d3a39a5f1e4 ("block: grant IOPRIO_CLASS_RT to CAP_SYS_NICE") >> > Signed-off-by: Alistair Delva <adelva@xxxxxxxxxx> >> > Cc: Khazhismel Kumykov <khazhy@xxxxxxxxxx> >> > Cc: Bart Van Assche <bvanassche@xxxxxxx> >> > Cc: Serge Hallyn <serge@xxxxxxxxxx> >> > Cc: Jens Axboe <axboe@xxxxxxxxx> >> > Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> >> > Cc: Paul Moore <paul@xxxxxxxxxxxxxx> >> > Cc: selinux@xxxxxxxxxxxxxxx >> > Cc: linux-security-module@xxxxxxxxxxxxxxx >> > Cc: kernel-team@xxxxxxxxxxx >> > Cc: stable@xxxxxxxxxxxxxxx # v5.14+ >> > --- >> > block/ioprio.c | 2 +- >> > 1 file changed, 1 insertion(+), 1 deletion(-) >> > >> > diff --git a/block/ioprio.c b/block/ioprio.c >> > index 0e4ff245f2bf..4d59c559e057 100644 >> > --- a/block/ioprio.c >> > +++ b/block/ioprio.c >> > @@ -69,7 +69,7 @@ int ioprio_check_cap(int ioprio) >> > >> > switch (class) { >> > case IOPRIO_CLASS_RT: >> > - if (!capable(CAP_SYS_NICE) && !capable(CAP_SYS_ADMIN)) >> > + if (!capable(CAP_SYS_ADMIN) && !capable(CAP_SYS_NICE)) >> > return -EPERM; >> > fallthrough; >> > /* rt has prio field too */ >> > -- >> > 2.34.0.rc1.387.gb447b232ab-goog >> > >> >> -- >> Ondrej Mosnacek >> Software Engineer, Linux Security - SELinux kernel >> Red Hat, Inc. >> -- gpg --locate-keys dominick.grift@xxxxxxxxxxx Key fingerprint = FCD2 3660 5D6B 9D27 7FC6 E0FF DA7E 521F 10F6 4098 Dominick Grift