Hi,

Sorry about the duplicated message; it looks like my previous email
contained some HTML that got it rejected by the linux-block list.

We've noticed a kernel oops during the stress-ng test on aarch64; more log
details are available at [1]. Christoph, do you think this could be related
to the recent blk_cleanup_disk changes [2]?

[15259.574356] loop32292: detected capacity change from 0 to 4096
[15259.574436] loop6370: detected capacity change from 0 to 4096
[15259.638249] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[15259.647046] Mem abort info:
[15259.649830]   ESR = 0x96000006
[15259.652875]   EC = 0x25: DABT (current EL), IL = 32 bits
[15259.653800] loop46040: detected capacity change from 4096 to 8192
[15259.658191]   SET = 0, FnV = 0
[15259.667311]   EA = 0, S1PTW = 0
[15259.670442] Data abort info:
[15259.673311]   ISV = 0, ISS = 0x00000006
[15259.677145]   CM = 0, WnR = 0
[15259.680102] user pgtable: 4k pages, 48-bit VAs, pgdp=000000093ce30000
[15259.686547] [0000000000000008] pgd=080000092b670003, p4d=080000092b670003, pud=0800000911225003, pmd=0000000000000000
[15259.697181] Internal error: Oops: 96000006 [#1] SMP
[15259.702069] Modules linked in: binfmt_misc fcrypt sm4_generic crc32_generic md4 michael_mic nhpoly1305_neon nhpoly1305 poly1305_generic libpoly1305 poly1305_neon rmd160 sha3_generic sm3_generic streebog_generic wp512 blowfish_generic blowfish_common cast5_generic des_generic libdes chacha_generic chacha_neon libchacha camellia_generic cast6_generic cast_common serpent_generic twofish_generic twofish_common dm_thin_pool dm_persistent_data dm_bio_prison nvme nvme_core loop dm_log_writes dm_flakey rfkill mlx5_ib ib_uverbs ib_core sunrpc mlx5_core joydev acpi_ipmi psample ipmi_ssif i2c_smbus mlxfw ipmi_devintf ipmi_msghandler thunderx2_pmu vfat fat cppc_cpufreq fuse zram ip_tables xfs crct10dif_ce ast ghash_ce i2c_algo_bit drm_vram_helper drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm drm gpio_xlp i2c_xlp9xx uas usb_storage aes_neon_bs [last unloaded: nvmet]
[15259.781079] CPU: 2 PID: 2800640 Comm: stress-ng Not tainted 5.13.0-rc3 #1
[15259.787865] Hardware name: HPE Apollo 70 /C01_APACHE_MB, BIOS L50_5.13_1.11 06/18/2019
[15259.797601] pstate: 60400009 (nZCv daif +PAN -UAO -TCO BTYPE=--)
[15259.803605] pc : blk_mq_run_hw_queues+0xec/0x10c
[15259.808226] lr : blk_freeze_queue_start+0x80/0x90
[15259.812925] sp : ffff80003b55bd00
[15259.816233] x29: ffff80003b55bd00 x28: ffff000a559320c0 x27: 0000000000000000
[15259.823375] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[15259.830513] x23: 0000000000000007 x22: 0000000000000000 x21: 0000000000000000
[15259.837645] x20: ffff00081aa6d3c0 x19: ffff00081aa6d3c0 x18: 00000000fffffffa
[15259.844776] x17: 0000000000000000 x16: 0000000000000000 x15: 0000040000000000
[15259.851905] x14: ffff000000000000 x13: 0000000000001000 x12: ffff000e7825b0a0
[15259.859034] x11: 0000000000000000 x10: ffff000e7825b098 x9 : ffff8000106d2950
[15259.866164] x8 : ffff000f7cfeab20 x7 : fffffffc00000000 x6 : ffff800011554000
[15259.873292] x5 : 0000000000000000 x4 : ffff000a559320c0 x3 : ffff00081aa6da28
[15259.880421] x2 : 0000000000000002 x1 : 0000000000000000 x0 : ffff0008b69f0a80
[15259.887551] Call trace:
[15259.889987]  blk_mq_run_hw_queues+0xec/0x10c
[15259.894253]  blk_freeze_queue_start+0x80/0x90
[15259.898603]  blk_cleanup_queue+0x40/0x114
[15259.902606]  blk_cleanup_disk+0x28/0x50
[15259.906434]  loop_control_ioctl+0x17c/0x190 [loop]
[15259.911224]  __arm64_sys_ioctl+0xb4/0x100
[15259.915229]  invoke_syscall+0x50/0x120
[15259.918972]  el0_svc_common.constprop.0+0x4c/0xd4
[15259.923666]  do_el0_svc+0x30/0x9c
[15259.926971]  el0_svc+0x2c/0x54
[15259.930022]  el0_sync_handler+0x1a4/0x1b0
[15259.934023]  el0_sync+0x19c/0x1c0
[15259.937335] Code: 91000000 b8626802 f9400021 f9402680 (b8627821)
[15259.943418] ---[ end trace 975879698e5c9146 ]---
[15260.113777] loop62523: detected capacity change from 4096 to 8192
[15260.113783] loop58780: detected capacity change from 4096 to 8192
[15260.113794] loop7620: detected capacity change from 4096 to 8192
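In case it helps with reproducing this outside of stress-ng: the trace points
at the LOOP_CTL_REMOVE path (loop_control_ioctl -> blk_cleanup_disk). Below is
a minimal, untested sketch of that kind of loop-device add/remove churn; the
iteration count and the idea of running several copies in parallel are just
guesses at what the stressor effectively does here:

/*
 * Hypothetical reproducer sketch (untested): churn loop devices through
 * /dev/loop-control so that LOOP_CTL_REMOVE repeatedly drives
 * loop_control_ioctl -> blk_cleanup_disk, the path in the oops above.
 * Run several instances concurrently to widen any race window.
 */
#include <fcntl.h>
#include <linux/loop.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
	int ctl = open("/dev/loop-control", O_RDWR);

	if (ctl < 0) {
		perror("open /dev/loop-control");
		return 1;
	}

	for (int i = 0; i < 100000; i++) {
		/* Find (or allocate) a free loop device. */
		int idx = ioctl(ctl, LOOP_CTL_GET_FREE);

		if (idx < 0)
			continue;

		/* Remove it again; an unbound device is torn down outright. */
		ioctl(ctl, LOOP_CTL_REMOVE, idx);
	}

	close(ctl);
	return 0;
}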
[1] https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/datawarehouse-public/2021/06/11/319533768/build_aarch64_redhat%3A1340796730/tests/10127520_aarch64_2_console.log
[2] https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/log/?h=for-next

Thanks,
Bruno

On Mon, Jun 14, 2021 at 2:35 PM CKI Project <cki-project@xxxxxxxxxx> wrote:
>
>
> Hello,
>
> We ran automated tests on a recent commit from this kernel tree:
>
>        Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
>             Commit: 30ec225aae2e - Merge branch 'for-5.14/block' into for-next
>
> The results of these automated tests are provided below.
>
>     Overall result: FAILED (see details below)
>              Merge: OK
>            Compile: OK
>              Tests: PANICKED
>
> All kernel binaries, config files, and logs are available for download here:
>
>   https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefix=datawarehouse-public/2021/06/11/319533768
>
> One or more kernel tests failed:
>
>     ppc64le:
>      ❌ Boot test
>
>     aarch64:
>      ❌ storage: software RAID testing
>      ❌ stress: stress-ng
>
> We hope that these logs can help you find the problem quickly. For the full
> detail on our testing procedures, please scroll to the bottom of this message.
>
> Please reply to this email if you have any questions about the tests that we
> ran or if you have any suggestions on how to make future tests more effective.
>         ,-.   ,-.
>        ( C ) ( K )  Continuous
>         `-',-.`-'   Kernel
>           ( I )     Integration
>            `-'
> ______________________________________________________________________________
>
> Compile testing
> ---------------
>
> We compiled the kernel for 4 architectures:
>
>     aarch64:
>       make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
>     ppc64le:
>       make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
>     s390x:
>       make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
>     x86_64:
>       make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
>
> Hardware testing
> ----------------
>
> We booted each kernel and ran the following tests:
>
>   aarch64:
>     Host 1:
>        ✅ Boot test
>        ✅ ACPI table test
>        ✅ LTP
>        ✅ CIFS Connectathon
>        ✅ POSIX pjd-fstest suites
>        ✅ Loopdev Sanity
>        ✅ Memory: fork_mem
>        ✅ Memory function: memfd_create
>        ✅ AMTU (Abstract Machine Test Utility)
>        ✅ Ethernet drivers sanity
>        ✅ storage: SCSI VPD
>        ✅ xarray-idr-radixtree-test
>
>     Host 2:
>        ✅ Boot test
>        ✅ xfstests - ext4
>        ✅ xfstests - xfs
>        ❌ storage: software RAID testing
>        ✅ Storage: swraid mdadm raid_module test
>        ✅ xfstests - btrfs
>        ✅ Storage blktests
>        ✅ Storage block - filesystem fio test
>        ✅ Storage block - queue scheduler test
>        ✅ Storage nvme - tcp
>        ✅ Storage: lvm device-mapper test
>        ❌ stress: stress-ng
>
>   ppc64le:
>     Host 1:
>        ✅ Boot test
>        ✅ LTP
>        ✅ CIFS Connectathon
>        ✅ POSIX pjd-fstest suites
>        ✅ Loopdev Sanity
>        ✅ Memory: fork_mem
>        ✅ Memory function: memfd_create
>        ✅ AMTU (Abstract Machine Test Utility)
>        ✅ Ethernet drivers sanity
>        ✅ xarray-idr-radixtree-test
>
>     Host 2:
>
>       ⚡ Internal infrastructure issues prevented one or more tests (marked
>       with ⚡⚡⚡) from running on this architecture.
>       This is not the fault of the kernel that was tested.
>
>        ⚡⚡⚡ Boot test
>        ⚡⚡⚡ xfstests - ext4
>        ⚡⚡⚡ xfstests - xfs
>        ⚡⚡⚡ storage: software RAID testing
>        ⚡⚡⚡ Storage: swraid mdadm raid_module test
>        ⚡⚡⚡ xfstests - btrfs
>        ⚡⚡⚡ Storage blktests
>        ⚡⚡⚡ Storage block - filesystem fio test
>        ⚡⚡⚡ Storage block - queue scheduler test
>        ⚡⚡⚡ Storage nvme - tcp
>        ⚡⚡⚡ Storage: lvm device-mapper test
>
>     Host 3:
>        ❌ Boot test
>        ⚡⚡⚡ xfstests - ext4
>        ⚡⚡⚡ xfstests - xfs
>        ⚡⚡⚡ storage: software RAID testing
>        ⚡⚡⚡ Storage: swraid mdadm raid_module test
>        ⚡⚡⚡ xfstests - btrfs
>        ⚡⚡⚡ Storage blktests
>        ⚡⚡⚡ Storage block - filesystem fio test
>        ⚡⚡⚡ Storage block - queue scheduler test
>        ⚡⚡⚡ Storage nvme - tcp
>        ⚡⚡⚡ Storage: lvm device-mapper test
>
>   s390x:
>     Host 1:
>
>       ⚡ Internal infrastructure issues prevented one or more tests (marked
>       with ⚡⚡⚡) from running on this architecture.
>       This is not the fault of the kernel that was tested.
>
>        ✅ Boot test
>        ⚡⚡⚡ LTP
>        ⚡⚡⚡ CIFS Connectathon
>        ⚡⚡⚡ POSIX pjd-fstest suites
>        ⚡⚡⚡ Loopdev Sanity
>        ⚡⚡⚡ Memory: fork_mem
>        ⚡⚡⚡ Memory function: memfd_create
>        ⚡⚡⚡ AMTU (Abstract Machine Test Utility)
>        ⚡⚡⚡ Ethernet drivers sanity
>        ⚡⚡⚡ xarray-idr-radixtree-test
>
>     Host 2:
>
>       ⚡ Internal infrastructure issues prevented one or more tests (marked
>       with ⚡⚡⚡) from running on this architecture.
>       This is not the fault of the kernel that was tested.
>
>        ⚡⚡⚡ Boot test
>        ⚡⚡⚡ xfstests - ext4
>        ⚡⚡⚡ xfstests - xfs
>        ⚡⚡⚡ Storage: swraid mdadm raid_module test
>        ⚡⚡⚡ xfstests - btrfs
>        ⚡⚡⚡ Storage blktests
>        ⚡⚡⚡ Storage nvme - tcp
>        ⚡⚡⚡ stress: stress-ng
>
>     Host 3:
>
>       ⚡ Internal infrastructure issues prevented one or more tests (marked
>       with ⚡⚡⚡) from running on this architecture.
>       This is not the fault of the kernel that was tested.
>
>        ⚡⚡⚡ Boot test
>        ⚡⚡⚡ xfstests - ext4
>        ⚡⚡⚡ xfstests - xfs
>        ⚡⚡⚡ Storage: swraid mdadm raid_module test
>        ⚡⚡⚡ xfstests - btrfs
>        ⚡⚡⚡ Storage blktests
>        ⚡⚡⚡ Storage nvme - tcp
>        ⚡⚡⚡ stress: stress-ng
>
>     Host 4:
>
>       ⚡ Internal infrastructure issues prevented one or more tests (marked
>       with ⚡⚡⚡) from running on this architecture.
>       This is not the fault of the kernel that was tested.
>
>        ⚡⚡⚡ Boot test
>        ⚡⚡⚡ xfstests - ext4
>        ⚡⚡⚡ xfstests - xfs
>        ⚡⚡⚡ Storage: swraid mdadm raid_module test
>        ⚡⚡⚡ xfstests - btrfs
>        ⚡⚡⚡ Storage blktests
>        ⚡⚡⚡ Storage nvme - tcp
>        ⚡⚡⚡ stress: stress-ng
>
>   x86_64:
>     Host 1:
>
>       ⚡ Internal infrastructure issues prevented one or more tests (marked
>       with ⚡⚡⚡) from running on this architecture.
>       This is not the fault of the kernel that was tested.
>
>        ⚡⚡⚡ Boot test
>        ⚡⚡⚡ Storage SAN device stress - qedf driver
>
>     Host 2:
>
>       ⚡ Internal infrastructure issues prevented one or more tests (marked
>       with ⚡⚡⚡) from running on this architecture.
>       This is not the fault of the kernel that was tested.
>
>        ⚡⚡⚡ Boot test
>        ⚡⚡⚡ xfstests - ext4
>        ⚡⚡⚡ xfstests - xfs
>        ⚡⚡⚡ xfstests - nfsv4.2
>        ⚡⚡⚡ storage: software RAID testing
>        ⚡⚡⚡ Storage: swraid mdadm raid_module test
>        ⚡⚡⚡ xfstests - btrfs
>        ⚡⚡⚡ xfstests - cifsv3.11
>        ⚡⚡⚡ Storage blktests
>        ⚡⚡⚡ Storage block - filesystem fio test
>        ⚡⚡⚡ Storage block - queue scheduler test
>        ⚡⚡⚡ Storage nvme - tcp
>        ⚡⚡⚡ Storage: lvm device-mapper test
>        ⚡⚡⚡ stress: stress-ng
>
>     Host 3:
>
>       ⚡ Internal infrastructure issues prevented one or more tests (marked
>       with ⚡⚡⚡) from running on this architecture.
>       This is not the fault of the kernel that was tested.
>
>        ⚡⚡⚡ Boot test
>        ⚡⚡⚡ Storage SAN device stress - qla2xxx driver
>
>     Host 4:
>
>       ⚡ Internal infrastructure issues prevented one or more tests (marked
>       with ⚡⚡⚡) from running on this architecture.
>       This is not the fault of the kernel that was tested.
>
>        ✅ Boot test
>        ⚡⚡⚡ Storage SAN device stress - mpt3sas_gen1
>
>     Host 5:
>
>       ⚡ Internal infrastructure issues prevented one or more tests (marked
>       with ⚡⚡⚡) from running on this architecture.
>       This is not the fault of the kernel that was tested.
>
>        ✅ Boot test
>        ✅ ACPI table test
>        ⚡⚡⚡ LTP
>        ⚡⚡⚡ CIFS Connectathon
>        ⚡⚡⚡ POSIX pjd-fstest suites
>        ⚡⚡⚡ Loopdev Sanity
>        ⚡⚡⚡ Memory: fork_mem
>        ⚡⚡⚡ Memory function: memfd_create
>        ⚡⚡⚡ AMTU (Abstract Machine Test Utility)
>        ⚡⚡⚡ Ethernet drivers sanity
>        ⚡⚡⚡ storage: SCSI VPD
>        ⚡⚡⚡ xarray-idr-radixtree-test
>
>     Host 6:
>
>       ⚡ Internal infrastructure issues prevented one or more tests (marked
>       with ⚡⚡⚡) from running on this architecture.
>       This is not the fault of the kernel that was tested.
>
>        ⚡⚡⚡ Boot test
>        ⚡⚡⚡ Storage SAN device stress - lpfc driver
>
>     Host 7:
>
>       ⚡ Internal infrastructure issues prevented one or more tests (marked
>       with ⚡⚡⚡) from running on this architecture.
>       This is not the fault of the kernel that was tested.
>
>        ⚡⚡⚡ Boot test
>        ⚡⚡⚡ Storage SAN device stress - qedf driver
>
>     Host 8:
>
>       ⚡ Internal infrastructure issues prevented one or more tests (marked
>       with ⚡⚡⚡) from running on this architecture.
>       This is not the fault of the kernel that was tested.
>
>        ⚡⚡⚡ Boot test
>        ⚡⚡⚡ Storage SAN device stress - lpfc driver
>
> Test sources: https://gitlab.com/cki-project/kernel-tests
>     Pull requests are welcome for new tests or improvements to existing tests!
>
> Aborted tests
> -------------
>
> Tests that didn't complete running successfully are marked with ⚡⚡⚡.
> If this was caused by an infrastructure issue, we try to mark that
> explicitly in the report.
>
> Waived tests
> ------------
>
> If the test run included waived tests, they are marked with . Such tests are
> executed but their results are not taken into account. Tests are waived when
> their results are not reliable enough, e.g. when they're just introduced or are
> being fixed.
>
> Testing timeout
> ---------------
>
> We aim to provide a report within reasonable timeframe. Tests that haven't
> finished running yet are marked with ⏱.
>