On Fri, Jun 26, 2020 at 3:41 PM Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> On Fri, Jun 26, 2020 at 12:35:58PM +0100, Steve McIntyre wrote:
> > I'm the maintainer in Debian for strace. Trying to reproduce
> > https://bugs.debian.org/963462 on my machine (Thinkpad T470), I've
> > found a repeatable hard lockup running the strace testsuite. Each time
> > it seems to have failed in a slightly different place in the testsuite
> > (suggesting it's not one particular syscall test that's triggering the
> > failure). I initially found this using Debian's current Buster kernel
> > (4.19.118+2+deb10u1), then backtracking I found that 4.19.98+1+deb10u1
> > worked fine.
> >
> > I've bisected to find the failure point along the linux-4.19.y stable
> > branch and what I've got to is the following commit:
> >
> > e58f543fc7c0926f31a49619c1a3648e49e8d233 is the first bad commit
> > commit e58f543fc7c0926f31a49619c1a3648e49e8d233
> > Author: Jann Horn <jannh@xxxxxxxxxx>
> > Date:   Thu Sep 13 18:12:09 2018 +0200
> >
> >     apparmor: don't try to replace stale label in ptrace access check
> >
> >     [ Upstream commit 1f8266ff58840d698a1e96d2274189de1bdf7969 ]
> >
> >     As a comment above begin_current_label_crit_section() explains,
> >     begin_current_label_crit_section() must run in sleepable context because
> >     when label_is_stale() is true, aa_replace_current_label() runs, which uses
> >     prepare_creds(), which can sleep.
> >     Until now, the ptrace access check (which runs with a task lock held)
> >     violated this rule.
> >
> >     Also add a might_sleep() assertion to begin_current_label_crit_section(),
> >     because asserts are less likely to be ignored than comments.
> >
> >     Fixes: b2d09ae449ced ("apparmor: move ptrace checks to using labels")
> >     Signed-off-by: Jann Horn <jannh@xxxxxxxxxx>
> >     Signed-off-by: John Johansen <john.johansen@xxxxxxxxxxxxx>
> >     Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
> >
> > :040000 040000 ca92f885a38c1747b812116f19de6967084a647e 865a227665e460e159502f21e8a16e6fa590bf50 M	security
> >
> > Considering I'm running strace build tests to provoke this bug,
> > finding the failure in a commit talking about ptrace changes does look
> > very suspicious...!
> >
> > Annoyingly, I can't reproduce this on my disparate other machines
> > here, suggesting it's maybe(?) timing related.

Does "hard lockup" mean that the HARDLOCKUP_DETECTOR infrastructure
prints a warning to dmesg? If so, can you share that warning?

If you don't have any way to see console output, and you don't have a
working serial console setup or such, you may want to try re-running
those tests while the kernel is booted with netconsole enabled to log
to a different machine over UDP (see
https://www.kernel.org/doc/Documentation/networking/netconsole.txt).

You may want to try setting the sysctl kernel.sysrq=1, then when the
system has locked up, press ALT+PRINT+L (to generate stack traces for
all active CPUs from NMI context), and maybe also ALT+PRINT+T and
ALT+PRINT+W (to collect more information about active tasks).

(If you share stack traces from these things with us, it would be
helpful if you could run them through scripts/decode_stacktrace.sh
from the kernel tree first, to add line number information.)

Trying to isolate the problem: __end_current_label_crit_section and
end_current_label_crit_section are aliases of each other (via
#define), so that line change can't have done anything.
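For reference, here is roughly what the relevant helpers in
security/apparmor/include/cred.h look like in the 4.19.y tree after the
patch; this is a sketch from memory, not a verbatim copy, so check it
against your sources:

/* security/apparmor/include/cred.h (approximate sketch) */

/* Lock-safe variant: only takes a reference to the newest label and
 * never tries to update the task's cred, so it may run under the task
 * lock in the ptrace access check. */
static inline struct aa_label *__begin_current_label_crit_section(void)
{
	struct aa_label *label = aa_current_raw_label();

	if (label_is_stale(label))
		label = aa_get_newest_label(label);

	return label;
}

/* Sleepable variant: on a stale label it additionally calls
 * aa_replace_current_label(), which uses prepare_creds() and can
 * sleep; hence the might_sleep() that the patch adds here. */
static inline struct aa_label *begin_current_label_crit_section(void)
{
	struct aa_label *label = aa_current_raw_label();

	might_sleep();

	if (label_is_stale(label)) {
		label = aa_get_newest_label(label);
		if (aa_replace_current_label(label) == 0)
			/* task cred will keep the reference */
			aa_put_label(label);
	}

	return label;
}

static inline void __end_current_label_crit_section(struct aa_label *label)
{
	if (label != aa_current_raw_label())
		aa_put_label(label);
}

/* The two "end" helpers are the same function, which is why switching
 * the ptrace hook from end_...() to __end_...() is a no-op. */
#define end_current_label_crit_section __end_current_label_crit_section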
That leaves two possibilities AFAICS:

 - the might_sleep() call by itself is causing issues for one of the
   remaining users of begin_current_label_crit_section() (because it
   causes preemption to happen more often where it didn't happen on
   PREEMPT_VOLUNTARY before, or because it's trying to print a warning
   message with the runqueue lock held, or something like that)

 - the lack of "if (aa_replace_current_label(label) == 0)
   aa_put_label(label);" in __begin_current_label_crit_section() is
   somehow causing issues

You could try to see whether just adding the might_sleep(), or just
replacing the begin_current_label_crit_section() call with
__begin_current_label_crit_section(), triggers the bug; a sketch of
both experiments follows below.

If you could recompile the kernel with CONFIG_DEBUG_ATOMIC_SLEEP (if
that isn't already set in your kernel config), that might help track
down the problem, unless it magically makes the problem stop
triggering (which I guess would be conceivable if this indeed is a
race).
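To make that concrete: starting from the parent of the bad commit, the
two experiments could look something like the following. This is an
illustrative sketch reconstructed from the two hunks of the patch, not
a tested change:

/* Experiment A: apply only the cred.h half of the patch, i.e. add the
 * might_sleep() to begin_current_label_crit_section() (as shown in the
 * earlier sketch) and leave apparmor_ptrace_access_check() untouched. */

/* Experiment B: apply only the lsm.c half of the patch, i.e. switch
 * the ptrace hook in security/apparmor/lsm.c to the lock-safe helpers
 * without adding any might_sleep(): */
static int apparmor_ptrace_access_check(struct task_struct *child,
					unsigned int mode)
{
	struct aa_label *tracer, *tracee;
	int error;

	/* takes a label reference without replacing the task's cred,
	 * so nothing here can sleep under the task lock */
	tracer = __begin_current_label_crit_section();
	tracee = aa_get_task_label(child);
	error = aa_may_ptrace(tracer, tracee,
			      (mode & PTRACE_MODE_READ) ? AA_PTRACE_READ
							: AA_PTRACE_TRACE);
	aa_put_label(tracee);
	__end_current_label_crit_section(tracer);

	return error;
}

/* If experiment A reproduces the lockup and B does not, the
 * might_sleep() itself is the culprit; if B reproduces it, the missing
 * "if (aa_replace_current_label(label) == 0) aa_put_label(label);" in
 * the ptrace path is where to look. */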