On Thu, Mar 18, 2021 at 1:45 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> On Thu, Mar 18, 2021 at 1:44 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> > On Thu, Mar 18, 2021 at 12:57 PM Serhei Makarov <smakarov@xxxxxxxxxx> wrote:
> > > On Thu, Mar 18, 2021 at 10:43 AM Serhei Makarov <smakarov@xxxxxxxxxx> wrote:
> > > > Jiri Olsa also reports seeing a similar deadlock at v5.10. I'm in the
> > > > middle of double-checking my bisection which ended up at a
> > > > seemingly-unrelated commit [2]
> > > >
> > > > [1] https://bugzilla.redhat.com/show_bug.cgi?id=1938312
> > > > [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.11-rc7&id=2dcb3964544177c51853a210b6ad400de78ef17d
> > >
> > > I've confirmed that my first bisection was incorrect by testing
> > > @1c2f67308af4 mm: thp: fix MADV_REMOVE deadlock on shmem THP
> > > and reproducing the deadlock. Previously this commit was marked as
> > > good, so it seems a kernel with the bug can sometimes pass the test.
> > >
> > > I'll double check rc6 next since I have the kernel handy. If
> > > 5.11.0-rc6 can also be made to fail, with Jiri Olsa's report it'd be
> > > necessary to do a wider search.
> > > There may be commits with intent similar to
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d92db5c04d103
> > > which tightened some of the behaviour of kernel reads, but affecting
> > > the audit subsystem?
> > > The actual stack trace that leads to deadlock goes through
> > > security_locked_down() which was present since the original patch
> > > reworking probe_read into separate probe_read_{user,kernel} helpers
> > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v5.11-rc7&id=6ae08ae3dea2
> > >
> > Added the SELinux list to the To/CC line; they should really be
> > involved. I'm also CC'ing the LSM list for good measure as there may
> > be other people that care about this.
>
> Argh, hit send a bit too quickly :/
>
> > FYI, the first instance of this thread that I saw can be found here
> > via the linux-audit list:
> >
> > https://lore.kernel.org/linux-audit/CANYvDQN7H5tVp47fbYcRasv4XF07eUbsDwT_eDCHXJUj43J7jQ@xxxxxxxxxxxxxx/

Previously in the thread there was a question about why audit events
are being generated inside bpf_probe_read_compat(). The answer is
pretty simple: we do an access check in the security_locked_down()
hook, inside the call to bpf_probe_read_kernel_common(), and that can
result in an audit event depending on the LSM and its policy.
Skipping the audit event in the case of an LSM access denial, e.g. an
SELinux AVC denial, could result in a silent access denial, which can
be maddening to both users and admins.

-- 
paul moore
www.paul-moore.com
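
For anyone who wants the access-check flow described above in concrete
form, here is a toy userspace model in Python. It is only a sketch and
not kernel code: ToyLSM, locked_down(), probe_read_kernel(), and the
"kernel-read" reason string are made-up names for illustration, and
returning -EPERM on denial is an assumption of the model. The only
point is that the denial and the audit record come from the same
check, so dropping the record leaves the caller with a bare error.

```python
import errno
from dataclasses import dataclass, field


@dataclass
class ToyLSM:
    """Made-up stand-in for an LSM and its policy; not a kernel interface."""
    deny_kernel_reads: bool          # toy policy decision for the "kernel-read" check
    emit_audit: bool = True          # set False to model skipping the audit event
    audit_log: list = field(default_factory=list)

    def locked_down(self, reason):
        """Toy analogue of the access check done in the security_locked_down() hook."""
        if not self.deny_kernel_reads:
            return 0
        if self.emit_audit:
            # The denial is what produces the audit record.
            self.audit_log.append(f"denied: {reason}")
        return -errno.EPERM          # assumed error code for this model


def probe_read_kernel(lsm, memory, addr):
    """Toy analogue of a kernel-read helper: consult the hook, then read."""
    ret = lsm.locked_down("kernel-read")
    if ret < 0:
        return ret, None
    return 0, memory[addr]


if __name__ == "__main__":
    memory = {0x10: 42}
    for emit in (True, False):
        lsm = ToyLSM(deny_kernel_reads=True, emit_audit=emit)
        ret, _ = probe_read_kernel(lsm, memory, 0x10)
        print(f"emit_audit={emit}: ret={ret}, audit_log={lsm.audit_log}")
```

With emit_audit=False the read still fails but nothing is logged, which
is the "silent access denial" case: the caller sees an error and the
admin has no record of why.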