Re: Find root of the stall: was: Re: [PATCH 2/3] livepatch: Avoid blocking tasklist_lock too long

Yafang Shao <laoar.shao@xxxxxxxxx> · Tue, 18 Feb 2025 10:19:30 +0800

On Fri, Feb 14, 2025 at 7:37 PM Petr Mladek <pmladek@xxxxxxxx> wrote:
>
> On Fri 2025-02-14 00:36:03, Josh Poimboeuf wrote:
> > On Fri, Feb 14, 2025 at 10:44:59AM +0800, Yafang Shao wrote:
> > > The longest duration of klp_try_complete_transition() ranges from 8.5
> > > to 17.2 seconds.
> > >
> > > It appears that the RCU stall is not only driven by num_processes *
> > > average_klp_try_switch_task, but also by contention within
> > > klp_try_complete_transition(), particularly around the tasklist_lock.
> > > Interestingly, even after replacing "read_lock(&tasklist_lock)" with
> > > "rcu_read_lock()", the RCU stall persists. My verification shows that
> > > the only way to prevent the stall is by checking need_resched() during
> > > each iteration of the loop.
> >
> > I'm confused... rcu_read_lock() shouldn't cause any contention, right?
> > So if klp_try_switch_task() isn't the problem, then what is?
>
> I agree that it does not make much sense.

I'm confused too and trying to understand it better.
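
For reference, the per-iteration need_resched() check I mentioned has
roughly the control flow below. This is only a userspace model, not the
kernel code: struct fake_task, need_resched_stub(), try_switch_task()
and count_passes() are made-up stand-ins, and "work_done" is a crude
substitute for elapsed time. It only shows that bailing out early makes
the transition finish over several short passes instead of one long one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a task; this is not struct task_struct. */
struct fake_task {
	bool switched;
};

static size_t work_done;	/* units of work done in the current pass */
static size_t resched_after;	/* pretend a resched is due after this much work */

/* Stub for need_resched(): true once the pass has done enough work. */
static bool need_resched_stub(void)
{
	return work_done >= resched_after;
}

/* Stub for klp_try_switch_task(): always succeeds in this model. */
static bool try_switch_task(struct fake_task *t)
{
	if (!t->switched) {
		t->switched = true;
		work_done++;	/* only real work costs time here */
	}
	return true;
}

/*
 * One pass over all tasks. Returns true only if every task was
 * visited and switched. Bails out as soon as a reschedule is
 * requested so that the caller can retry later.
 */
static bool try_complete_transition(struct fake_task *tasks, size_t n)
{
	bool complete = true;
	size_t i;

	work_done = 0;
	for (i = 0; i < n; i++) {
		if (need_resched_stub()) {
			complete = false;	/* unvisited tasks remain */
			break;
		}
		if (!try_switch_task(&tasks[i]))
			complete = false;
	}
	return complete;
}

/* Retry until the transition completes; returns the number of passes. */
size_t count_passes(size_t n, size_t limit)
{
	struct fake_task *tasks;
	size_t passes = 0;

	if (limit == 0)
		return 0;	/* would never make progress */
	tasks = calloc(n, sizeof(*tasks));
	if (!tasks)
		return 0;
	resched_after = limit;
	do {
		passes++;
	} while (!try_complete_transition(tasks, n));
	free(tasks);
	return passes;
}
```

In this model, count_passes(10, 4) needs three passes, while a limit
larger than the number of tasks finishes in one.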

>
> > I wonder if those function timings might be misleading.  If
> > klp_try_complete_transition() gets preempted immediately when it
> > releases the lock, it could take a while before it eventually returns.
> > So that funclatency might not be telling the whole story.
>
> The scheduling might be an explanation.
>
> > Though 8.5 - 17.2 seconds is a bit excessive...
>
> If klp_try_complete_transition() scheduled out and we see this delay
> then the system likely had a pretty high load at the moment.
> Is it possible?

It appears to be workload-related. The RCU warning occurred at
specific time periods, likely due to certain workloads running at
those times, though I haven't confirmed it yet.

>
> Yafang, just to be sure. Have you seen these numbers with
> the original klp_try_complete_transition() code and with debug
> messages disabled?

Right. These RCU warnings appeared on our production servers without
any debugging enabled, and klp_try_complete_transition() hasn't
changed either.

>
> Or did you see them with some extra debugging code or other
> modifications?

No, these are the default production settings as they originally were.

>
> Also just to be sure. Is this on bare metal?

Yes.

>
> Finally, what preemption mode are you using? Which CONFIG_PREEMPT*?

The preemption configuration is as follows:

CONFIG_PREEMPT_BUILD=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_COUNT=y
CONFIG_PREEMPTION=y
CONFIG_PREEMPT_DYNAMIC=y
CONFIG_PREEMPT_RCU=y
CONFIG_HAVE_PREEMPT_DYNAMIC=y
CONFIG_HAVE_PREEMPT_DYNAMIC_CALL=y
CONFIG_PREEMPT_NOTIFIERS=y
# CONFIG_DEBUG_PREEMPT is not set
CONFIG_PREEMPTIRQ_TRACEPOINTS=y
# CONFIG_PREEMPT_TRACER is not set
# CONFIG_PREEMPTIRQ_DELAY_TEST is not set
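
Note that with CONFIG_PREEMPT_DYNAMIC=y the config only gives the
built-in default; the mode can be overridden with the preempt= boot
parameter or through debugfs. The active mode can be read back as
sketched below. This assumes debugfs is mounted at /sys/kernel/debug
and that the file is sched/preempt (older kernels used sched_preempt),
with the active mode shown in parentheses, e.g. "none (voluntary) full".
The helper names are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the parenthesized token from a line such as
 * "none (voluntary) full" into out. Returns 0 on success, or -1 if
 * no "(mode)" token is found or it does not fit into out.
 */
int active_preempt_mode(const char *line, char *out, size_t outlen)
{
	const char *open = strchr(line, '(');
	const char *close = open ? strchr(open, ')') : NULL;
	size_t len;

	if (!open || !close)
		return -1;
	len = (size_t)(close - open - 1);
	if (len == 0 || len >= outlen)
		return -1;
	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}

/* Read the debugfs file (assumed path; usually needs root). */
int read_preempt_mode(char *out, size_t outlen)
{
	char line[128];
	FILE *f = fopen("/sys/kernel/debug/sched/preempt", "r");
	int ret = -1;

	if (!f)
		return -1;
	if (fgets(line, sizeof(line), f))
		ret = active_preempt_mode(line, out, outlen);
	fclose(f);
	return ret;
}
```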

> PS: JFYI, I am on vacation the following week and won't have
>    access to email...

Enjoy your holiday.

--
Regards

Yafang