On 12/9/2019 5:58 AM, Stephen Smalley wrote: > On 12/9/19 8:21 AM, Stephen Smalley wrote: >> On 12/9/19 2:57 AM, Ondrej Mosnacek wrote: >>> Commit b1d9e6b0646d ("LSM: Switch to lists of hooks") switched the LSM >>> infrastructure to use per-hook lists, which meant that removing the >>> hooks for a given module was no longer atomic. Even though the commit >>> clearly documents that modules implementing runtime revmoval of hooks >>> (only SELinux attempts this madness) need to take special precautions to >>> avoid race conditions, SELinux has never addressed this. >>> >>> By inserting an artificial delay between the loop iterations of >>> security_delete_hooks() (I used 100 ms), booting to a state where >>> SELinux is enabled, but policy is not yet loaded, and running these >>> commands: >>> >>> while true; do ping -c 1 <some IP>; done & >>> echo -n 1 >/sys/fs/selinux/disable >>> kill %1 >>> wait >>> >>> ...I was able to trigger NULL pointer dereferences in various places. I >>> also have a report of someone getting panics on a stock RHEL-8 kernel >>> after setting SELINUX=disabled in /etc/selinux/config and rebooting >>> (without adding "selinux=0" to kernel command-line). >>> >>> Reordering the SELinux hooks such that those that allocate structures >>> are removed last seems to prevent these panics. It is very much possible >>> that this doesn't make the runtime disable completely race-free, but at >>> least it makes the operation much less fragile. >>> >>> An alternative (and safer) solution would be to add NULL checks to each >>> hook, but doing this just to support the runtime disable hack doesn't >>> seem to be worth the effort... >> >> Personally, I would prefer to just get rid of runtime disable altogether; it also precludes making the hooks read-only after initialization. IMHO, selinux=0 is the proper way to disable SELinux if necessary. I believe there is an open bugzilla on Fedora related to this issue, since runtime disable was originally introduced for Fedora. > > Also, if we have to retain this support, it seems like this ought to be fixed in the LSM framework especially since it was a change there that broke the SELinux implementation. Agreed, mostly. Deleting an LSM is fundamentally something the infrastructure should handle *if* we allow it. Should we decide at some point to allow loadable modules, as Tetsuo has advocated from time to time, we would need a general solution. We don't have a general solution now because only SELinux wants it. The previous implementation was under #ifdef for SELinux. At the time I understood that there was no interest in investing in it. The implementation passed tests at the time. I propose that until such time as someone decides to seriously investigate loadable security modules* the sole user of the deletion mechanism is welcome to invest whatever they like in their special case, and I will be happy to lend whatever assistance I can. --- * I do not plan to propose an implementation of loadable modules. I leave that as an exercise for the next generation.