Re: [PATCHv2 3/3] rcu: coordinate tick dependency during concurrent offlining

On Sun, Oct 02, 2022 at 05:08:58PM +0200, Frederic Weisbecker wrote:
> On Sun, Oct 02, 2022 at 09:29:59PM +0800, Pingfan Liu wrote:
> > On Fri, Sep 30, 2022 at 11:45 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> > >
> > [...]
> > > > > I have managed to get hold of three two-socket machines, each with 256 CPUs.
> > > > > The test has run for about 7 hours so far without any problems, using the following command:
> > > > > tools/testing/selftests/rcutorture/bin/kvm-remote.sh "sys1 sys2 sys3" \
> > > > > --duration 45h --cpus 256 --bootargs "rcutorture.onoff_interval=200 rcutorture.onoff_holdoff=30" --configs "96*TREE04"
> > > > >
> > > > > It seems promising.
> > > > >
> > > >
> > > > The test is against the v6.0-rc7 kernel, with only 96926686deab ("rcu:
> > > > Make CPU-hotplug removal operations enable tick") reverted. It was
> > > > close to the end, but unfortunately it failed.
> > > > Quoting from the remote log:
> > > > "
> > > > TREE04.57 ------- 4410955 GPs (27.2281/s) [rcu: g36045577 f0x0
> > > > total-gps=9011687] n_max_cbs: 4111392
> > > > TREE04.58 ------- 4368391 GPs (26.9654/s) [rcu: g35630093 f0x0
> > > > total-gps=8907816] n_max_cbs: 2411104
> > > > TREE04.59 ------- 800516 GPs (4.94146/s) n_max_cbs: 3634471
> > > > QEMU killed
> > > > TREE04.59 no success message, 10547 successful version messages
> > > > WARNING: TREE04.59 GP HANG at 800516 torture stat 1925
> > > > WARNING: Assertion failure in
> > > > /home/linux/tools/testing/selftests/rcutorture/res/2022.09.26-23.33.34-remote/TREE04.59/console.log
> > > > TREE04.59
> > > > WARNING: Summary: Call Traces: 1 Stalls: 8615
> > > > TREE04.6 ------- 4348443 GPs (26.8422/s) [rcu: g35341129 f0x0
> > > > total-gps=8835575] n_max_cbs: 2329432
> > >
> > > First, thank you for running this!
> > >
> > > This is not the typical failure that we were seeing, which would show
> > > up as a 2.199.0-second RCU CPU stall during which time there would be
> > > no console messages.
> > >
> > > But please do let me know how continuing tests go!
> > >
> > 
> > This time, the same test environment, but against v6.0-rc7 mainline,
> > also hit this atypical failure.
> 
> Interesting, I'm trying to reproduce...

Very good, thank you!

							Thanx, Paul


