----- Original Message -----
> From: "Ming Lei" <ming.lei@xxxxxxxxxx>
> To: "Veronika Kabatova" <vkabatov@xxxxxxxxxx>
> Cc: linux-block@xxxxxxxxxxxxxxx, axboe@xxxxxxxxx, "CKI Project" <cki-project@xxxxxxxxxx>, "Changhui Zhong"
> <czhong@xxxxxxxxxx>
> Sent: Sunday, September 6, 2020 5:19:08 AM
> Subject: Re: 💥 PANICKED: Test report for kernel 5.9.0-rc3-020ad03.cki (block)
>
> Hi Veronika,
>
> On Fri, Sep 04, 2020 at 07:06:25AM -0400, Veronika Kabatova wrote:
> >
> >
> > ----- Original Message -----
> > > From: "Ming Lei" <ming.lei@xxxxxxxxxx>
> > > To: "CKI Project" <cki-project@xxxxxxxxxx>
> > > Cc: linux-block@xxxxxxxxxxxxxxx, axboe@xxxxxxxxx, "Changhui Zhong"
> > > <czhong@xxxxxxxxxx>
> > > Sent: Friday, September 4, 2020 3:02:33 AM
> > > Subject: Re: 💥 PANICKED: Test report for kernel 5.9.0-rc3-020ad03.cki (block)
> > >
> > > On Thu, Sep 03, 2020 at 05:07:57PM -0000, CKI Project wrote:
> > > >
> > > > Hello,
> > > >
> > > > We ran automated tests on a recent commit from this kernel tree:
> > > >
> > > >        Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
> > > >             Commit: 020ad0333b03 - Merge branch 'for-5.10/block' into for-next
> > > >
> > > > The results of these automated tests are provided below.
> > > >
> > > >     Overall result: FAILED (see details below)
> > > >              Merge: OK
> > > >            Compile: OK
> > > >              Tests: PANICKED
> > > >
> > > > All kernel binaries, config files, and logs are available for download
> > > > here:
> > > >
> > > >   https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=datawarehouse/2020/09/02/613166
> > > >
> > > > One or more kernel tests failed:
> > > >
> > > >     ppc64le:
> > > >      💥 storage: software RAID testing
> > > >
> > > >     aarch64:
> > > >      💥 storage: software RAID testing
> > > >
> > > >     x86_64:
> > > >      💥 storage: software RAID testing
> > > >
> > > > We hope that these logs can help you find the problem quickly.
> > > > For the full detail on our testing procedures, please scroll to the bottom
> > > > of this message.
> > > >
> > > > Please reply to this email if you have any questions about the tests that
> > > > we ran or if you have any suggestions on how to make future tests more
> > > > effective.
> > > >
> > > >         ,-.   ,-.
> > > >        ( C ) ( K )  Continuous
> > > >         `-',-.`-'   Kernel
> > > >           ( I )     Integration
> > > >            `-'
> > > > ______________________________________________________________________________
> > > >
> > > > Compile testing
> > > > ---------------
> > > >
> > > > We compiled the kernel for 4 architectures:
> > > >
> > > >     aarch64:
> > > >       make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
> > > >
> > > >     ppc64le:
> > > >       make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
> > > >
> > > >     s390x:
> > > >       make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
> > > >
> > > >     x86_64:
> > > >       make options: make -j30 INSTALL_MOD_STRIP=1 targz-pkg
> > > >
> > > >
> > > >
> > > > Hardware testing
> > > > ----------------
> > > > We booted each kernel and ran the following tests:
> > > >
> > > >   aarch64:
> > > >     Host 1:
> > > >        ✅ Boot test
> > > >        ✅ ACPI table test
> > > >        ✅ LTP
> > > >        ✅ Loopdev Sanity
> > > >        ✅ Memory function: memfd_create
> > > >        ✅ AMTU (Abstract Machine Test Utility)
> > > >        ✅ Ethernet drivers sanity
> > > >        ✅ storage: SCSI VPD
> > > >        🚧 ✅ CIFS Connectathon
> > > >        🚧 ✅ POSIX pjd-fstest suites
> > > >
> > > >     Host 2:
> > > >
> > > >        ⚡ Internal infrastructure issues prevented one or more tests (marked
> > > >        with ⚡⚡⚡) from running on this architecture.
> > > >        This is not the fault of the kernel that was tested.
> > > >
> > > >        ⚡⚡⚡ Boot test
> > > >        ⚡⚡⚡ xfstests - ext4
> > > >        ⚡⚡⚡ xfstests - xfs
> > > >        ⚡⚡⚡ storage: software RAID testing
> > > >        ⚡⚡⚡ stress: stress-ng
> > > >        🚧 ⚡⚡⚡ xfstests - btrfs
> > > >        🚧 ⚡⚡⚡ Storage blktests
> > > >
> > > >     Host 3:
> > > >        ✅ Boot test
> > > >        ✅ xfstests - ext4
> > > >        ✅ xfstests - xfs
> > > >        💥 storage: software RAID testing
> > > >        ⚡⚡⚡ stress: stress-ng
> > > >        🚧 ⚡⚡⚡ xfstests - btrfs
> > > >        🚧 ⚡⚡⚡ Storage blktests
> > > >
> > > >   ppc64le:
> > > >     Host 1:
> > > >        ✅ Boot test
> > > >        🚧 ✅ kdump - sysrq-c
> > > >
> > > >     Host 2:
> > > >        ✅ Boot test
> > > >        ✅ xfstests - ext4
> > > >        ✅ xfstests - xfs
> > > >        💥 storage: software RAID testing
> > > >        🚧 ⚡⚡⚡ xfstests - btrfs
> > > >        🚧 ⚡⚡⚡ Storage blktests
> > > >
> > > >     Host 3:
> > > >
> > > >        ⚡ Internal infrastructure issues prevented one or more tests (marked
> > > >        with ⚡⚡⚡) from running on this architecture.
> > > >        This is not the fault of the kernel that was tested.
> > > >
> > > >        ✅ Boot test
> > > >        ⚡⚡⚡ LTP
> > > >        ⚡⚡⚡ Loopdev Sanity
> > > >        ⚡⚡⚡ Memory function: memfd_create
> > > >        ⚡⚡⚡ AMTU (Abstract Machine Test Utility)
> > > >        ⚡⚡⚡ Ethernet drivers sanity
> > > >        🚧 ⚡⚡⚡ CIFS Connectathon
> > > >        🚧 ⚡⚡⚡ POSIX pjd-fstest suites
> > > >
> > > >   s390x:
> > > >     Host 1:
> > > >        ✅ Boot test
> > > >        ✅ stress: stress-ng
> > > >        🚧 ✅ Storage blktests
> > > >
> > > >     Host 2:
> > > >        ✅ Boot test
> > > >        ✅ LTP
> > > >        ✅ Loopdev Sanity
> > > >        ✅ Memory function: memfd_create
> > > >        ✅ AMTU (Abstract Machine Test Utility)
> > > >        ✅ Ethernet drivers sanity
> > > >        🚧 ✅ CIFS Connectathon
> > > >        🚧 ✅ POSIX pjd-fstest suites
> > > >
> > > >   x86_64:
> > > >     Host 1:
> > > >        ✅ Boot test
> > > >        ✅ Storage SAN device stress - qedf driver
> > > >
> > > >     Host 2:
> > > >        ⏱ Boot test
> > > >        ⏱ Storage SAN device stress - mpt3sas_gen1
> > > >
> > > >     Host 3:
> > > >        ✅ Boot test
> > > >        ✅ xfstests - ext4
> > > >        ✅ xfstests - xfs
> > > >        💥 storage: software RAID testing
> > > >        ⚡⚡⚡ stress: stress-ng
> > > >        🚧 ⚡⚡⚡ xfstests - btrfs
> > > >        🚧 ⚡⚡⚡ Storage blktests
> > > >
> > > >     Host 4:
> > > >        ✅ Boot test
> > > >        ✅ Storage SAN device stress - lpfc driver
> > > >
> > > >     Host 5:
> > > >        ✅ Boot test
> > > >        🚧 ✅ kdump - sysrq-c
> > > >
> > > >     Host 6:
> > > >        ✅ Boot test
> > > >        ✅ ACPI table test
> > > >        ✅ LTP
> > > >        ✅ Loopdev Sanity
> > > >        ✅ Memory function: memfd_create
> > > >        ✅ AMTU (Abstract Machine Test Utility)
> > > >        ✅ Ethernet drivers sanity
> > > >        ✅ kernel-rt: rt_migrate_test
> > > >        ✅ kernel-rt: rteval
> > > >        ✅ kernel-rt: sched_deadline
> > > >        ✅ kernel-rt: smidetect
> > > >        ✅ storage: SCSI VPD
> > > >        🚧 ✅ CIFS Connectathon
> > > >        🚧 ✅ POSIX pjd-fstest suites
> > > >
> > > >     Host 7:
> > > >        ✅ Boot test
> > > >        ✅ kdump - sysrq-c - megaraid_sas
> > > >
> > > >     Host 8:
> > > >        ✅ Boot test
> > > >        ✅ Storage SAN device stress - qla2xxx driver
> > > >
> > > >     Host 9:
> > > >        ⏱ Boot test
> > > >        ⏱ kdump - sysrq-c - mpt3sas_gen1
> > > >
> > > >   Test sources: https://gitlab.com/cki-project/kernel-tests
> > > >
> > > Hello,
> > >
> >
> > Hi Ming,
> >
> > first the good news: Both issues detected by LTP and RAID test are
> > officially gone after the revert. There's some x86_64 testing still
> > running but the results look good so far!
> >
> > > Can you share with us the exact commands for setting up xfstests over
> > > 'software RAID testing' from the above tree?
> > >
> >
> > It's this test (which, seeing your @redhat email, you can also trigger
> > via internal Brew testing if you use the "stor" test set):
> >
> > https://gitlab.com/cki-project/kernel-tests/-/tree/master/storage/swraid/trim
> >
> > The important part of the test is:
> >
> > https://gitlab.com/cki-project/kernel-tests/-/blob/master/storage/swraid/trim/main.sh#L27
> >
> > The test maintainer (Changhui) is CCed on this thread in case you need
> > any help or have questions about the test.
> >
> >
> >
> > I'll just quickly mention, please be careful if you're planning on
> > testing LTP/msgstress04 on ppc64le in Beaker, as the conserver overload
> > is causing issues for lab owners.
> >
> >
> > Let us know if we can help you with something else,
>
> I have verified that the revised patches do fix the kernel oops in 'software
> RAID storage test'. However, I can't reproduce the OOM in LTP/msgstress04.
>
> Could you help to check if LTP/msgstress04 can pass with the following
> tree (top three patches) which is against the latest for-5.10/block:
>
> https://github.com/ming1/linux/commits/v5.9-rc-block-test
>

Hi,

I ran the affected ppc64le testing with your new patches and it gives the
expected results.

We also got in touch with the LTP test maintainers. It looks like there
are some issues with the msgstress tests as well. These got amplified by
the patch, and the combination caused the conserver overload. The tests
themselves need to be fixed too.

Veronika

> Thanks,
> Ming
>
>