Re: [BUG] Random intermittent boost failures (Was Re: [BUG] TREE04..)

On Mon, Sep 11, 2023 at 9:49 AM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Mon, Sep 11, 2023 at 01:17:30PM +0000, Joel Fernandes wrote:
> > On Mon, Sep 11, 2023 at 01:16:21AM -0700, Paul E. McKenney wrote:
> > > On Mon, Sep 11, 2023 at 02:27:25AM +0000, Joel Fernandes wrote:
> > > > On Sun, Sep 10, 2023 at 07:37:13PM -0400, Joel Fernandes wrote:
> > > > > On Sun, Sep 10, 2023 at 5:16 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Sun, Sep 10, 2023 at 08:14:45PM +0000, Joel Fernandes wrote:
> > > > > [...]
> > > > > > > >  I have been running into another intermittent one as well, the
> > > > > > > > boost failure, which happens once in 10-15 runs or so.
> > > > > > > >
> > > > > > > > I was thinking of running the following configuration on an automated
> > > > > > > > regular basis to at least provide a better clue on the lucky run that
> > > > > > > > catches an issue. But then the issue is it would change timing enough
> > > > > > > > to maybe hide bugs. I could also make it submit logs automatically to
> > > > > > > > the list on such occurrences, but one step at a time and all that.  I
> > > > > > > > do need to add (hopefully less noisy) tick/timer related trace events.
> > > > > > > >
> > > > > > > > # Define the bootargs array
> > > > > > > > bootargs=(
[...]
> > > > > > > So some insight on this boost failure. Just before the boost failures are
> > > > > > > reported, I see the migration thread interfering with the rcu_preempt thread
> > > > > > > (aka GP kthread). See trace below. Of note is that the rcu_preempt thread is
> > > > > > > runnable while context switching, which means its execution is being
> > > > > > > interfered with. The rcu_preempt thread is at RT prio 2, as can be seen.
> > > > > > >
> > > > > > > So some open-ended questions: what exactly does the migration thread want?
> > > > > > > Is this something related to CPU hotplug? And if the migration thread had to
> > > > > > > run, why did the rcu_preempt thread not get pushed to another CPU by the
> > > > > > > scheduler? We have 16 vCPUs for this test.
> > > > > >
> > > > > > Maybe we need a cpus_read_lock() before doing a given boost-test interval
> > > > > > and a cpus_read_unlock() after finishing one?  But much depends on
> > > > > > exactly what is starting those migration threads.
> > > > >
> > > > > But in the field, a real RT task can preempt a reader without doing
> > > > > cpus_read_lock() and may run into a similar boost issue?
> > >
> > > The sysctl_sched_rt_runtime should prevent a livelock in most
> > > configurations.  Here, rcutorture explicitly disables this.
> >
> > I see. Though RT throttling will actually stall the rcu_preempt thread as
> > well in the real world. RT throttling is a bit broken, and we're trying to
> > fix it in scheduler land. Even if there are idle CPUs, RT throttling will
> > starve not just the offending RT task but all of them, essentially causing
> > a priority inversion between running RT and CFS tasks.
>
> Fair point.  But that requires that the offending runaway RT task hit both
> a reader and the grace-period kthread.  Keep in mind that rcutorture
> provisions one runaway RT task per CPU, which in the real world is
> hopefully quite rare.  Hopefully.  ;-)

You are right, I exaggerated a bit. Indeed, in the real world RT
throttling can cause a prio inversion with CFS only if all other CPUs
are also RT throttled. Otherwise the scheduler tries to migrate the RT
task to another CPU. That's a great point.
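
As an aside, Paul's earlier cpus_read_lock() suggestion would, roughly,
bracket each boost-test interval so that CPU-hotplug operations (and the
migration work they trigger) cannot run concurrently with it. A pseudocode
sketch of that idea (the cpu-hotplug lock calls are the real kernel API;
the function names around them are hypothetical, not actual rcutorture code):

```c
/* Hypothetical sketch only, not the actual rcutorture implementation. */
static void rcu_torture_one_boost_interval(void)
{
	cpus_read_lock();	/* hold off CPU hotplug for this interval */
	do_boost_test();	/* hypothetical: run one boost-test interval */
	cpus_read_unlock();	/* allow hotplug operations to proceed again */
}
```

Whether this helps depends, as Paul notes, on what is actually starting
those migration threads in the first place.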
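
For readers following along, the RT throttling budget under discussion is
controlled by the sched_rt_period_us and sched_rt_runtime_us sysctls. A
quick sketch of the budget arithmetic, assuming the usual kernel defaults
(1 s period, 950 ms runtime; check your own system's values):

```shell
# Assumed kernel defaults for the RT throttling knobs:
#   /proc/sys/kernel/sched_rt_period_us  = 1000000  (1 s accounting period)
#   /proc/sys/kernel/sched_rt_runtime_us =  950000  (RT budget per period)
period_us=1000000
runtime_us=950000

# With these defaults, RT tasks may consume 95% of each period before
# being throttled, leaving 5% for non-RT work.
echo "RT budget: $((100 * runtime_us / period_us))% of each period"

# Writing -1 to the runtime knob disables throttling entirely, which is
# what rcutorture does for its boost tests (needs root; illustration only):
#   sysctl -w kernel.sched_rt_runtime_us=-1
```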

> Sounds like good progress!  Please let me know how it goes!!!

Thanks! Will do,

 - Joel
