Hello,

We are sending this report regarding the kernel oops on ppc64le during
"CIFS Connectathon" test.

https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/datawarehouse-public/2021/06/01/312989200/build_ppc64le_redhat%3A1309223508/tests/10078627_ppc64le_1_console.log

We are sending this to the block tree as we hit this problem only when
testing a kernel from this tree. We are not able to easily reproduce it
either (So far we've seen this only twice).

[10904.818294] CIFS: Attempting to mount \\ibm-p9z-16-lp1.lab.eng.bos.redhat.com\testuser
[10905.210957] BUG: Unable to handle kernel data access on write at 0x99a0978ccb2c73a5
[10905.210972] Faulting instruction address: 0xc0000000004a9d60
[10905.210978] Oops: Kernel access of bad area, sig: 11 [#1]
[10905.210983] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
[10905.210991] Modules linked in: md4 cmac cifs libdes libarc4 dns_resolver nls_utf8 isofs kvm_pr kvm snd_seq_dummy dummy veth minix binfmt_misc can_raw can nfsv3 nfs_acl nfs lockd grace fscache netfs rds tun brd overlay exfat vfat fat loop n_gsm pps_ldisc ppp_synctty mkiss ax25 ppp_async ppp_generic serport slcan slip slhc snd_hrtimer snd_seq snd_seq_device sctp ip6_udp_tunnel udp_tunnel snd_timer snd soundcore authenc pcrypt crypto_user sha3_generic n_hdlc bonding tls rfkill sunrpc ibmveth pseries_rng crct10dif_vpmsum drm drm_panel_orientation_quirks fuse i2c_core zram ip_tables xfs ibmvscsi scsi_transport_srp vmx_crypto crc32c_vpmsum [last unloaded: ltp_insmod01]
[10905.211103] CPU: 0 PID: 398969 Comm: grepconf.sh Tainted: G OE 5.13.0-rc3 #1
[10905.211111] NIP: c0000000004a9d60 LR: c0000000004a9e94 CTR: 0000000000000000
[10905.211117] REGS: c0000000187e36b0 TRAP: 0380 Tainted: G OE (5.13.0-rc3)
[10905.211124] MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 84000200 XER: 00000000
[10905.211138] CFAR: c0000000004a9c60 IRQMASK: 0
[10905.211138] GPR00: 146b53b1bb405e6b c0000000187e3950 c000000001e30e00 0000000000000000
[10905.211138] GPR04: 0000000000000dc0 00000000000f1cc0 0000000000be01c0 c0000007ff793f30
[10905.211138] GPR08: 0000000000000008 99a0978ccb2c739d c0000000017e3f30 c0000000190a7c00
[10905.211138] GPR12: c0000000190a7c78 c000000002070000 0000000000000000 0000000000000000
[10905.211138] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[10905.211138] GPR20: 0000000000000000 00007fffe215fb15 c00000000e90ed00 ffffffffffffffff
[10905.211138] GPR24: 0000000010014261 0000000000000000 c000000001e689b8 c00000000084c804
[10905.211138] GPR28: 0000000000000dc0 0000000000000000 0000000000000001 c000000007013e00
[10905.211215] NIP [c0000000004a9d60] kmem_cache_alloc+0x190/0x480
[10905.211225] LR [c0000000004a9e94] kmem_cache_alloc+0x2c4/0x480
[10905.211232] Call Trace:
[10905.211242] [c0000000187e3950] [c0000000004a9e84] kmem_cache_alloc+0x2b4/0x480 (unreliable)
[10905.211251] [c0000000187e39c0] [c00000000084c804] security_file_alloc+0x44/0xf0
[10905.211261] [c0000000187e3a00] [c000000000504738] __alloc_file+0x78/0x130
[10905.211269] [c0000000187e3a40] [c000000000504e58] alloc_empty_file+0x78/0x170
[10905.211278] [c0000000187e3ac0] [c00000000051b318] path_openat+0x58/0x12d0
[10905.211287] [c0000000187e3bd0] [c00000000051f070] do_filp_open+0x90/0x140
[10905.211295] [c0000000187e3cf0] [c0000000004ff0e8] do_sys_openat2+0xf8/0x1f0
[10905.211303] [c0000000187e3d60] [c0000000004ff460] sys_openat+0x60/0xc0
[10905.211310] [c0000000187e3db0] [c00000000002c394] system_call_exception+0x104/0x2c0
[10905.211319] [c0000000187e3e10] [c00000000000d45c] system_call_common+0xec/0x278
[10905.211327] --- interrupt: c00 at 0x7fffbca93150

Bruno

On Fri, Jun 11, 2021 at 4:47 PM CKI Project <cki-project@xxxxxxxxxx> wrote:
>
>
> Hello,
>
> We ran automated tests on a recent commit from this kernel tree:
>
>        Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git
>             Commit: 314e07c78aef - Merge branch 'for-5.14/block' into for-next
>
> The results of these automated tests are provided below.
>
>     Overall result: FAILED (see details below)
>              Merge: OK
>            Compile: OK
>              Tests: PANICKED
>
> All kernel binaries, config files, and logs are available for download here:
>
>   https://arr-cki-prod-datawarehouse-public.s3.amazonaws.com/index.html?prefix=datawarehouse-public/2021/06/01/312989200
>
> One or more kernel tests failed:
>
>     ppc64le:
>        CIFS Connectathon
>        Boot test
>
>     aarch64:
>        stress: stress-ng
>      ❌ LTP
>
> We hope that these logs can help you find the problem quickly. For the full
> detail on our testing procedures, please scroll to the bottom of this message.
>
> Please reply to this email if you have any questions about the tests that we
> ran or if you have any suggestions on how to make future tests more effective.
>
>         ,-.   ,-.
>        ( C ) ( K )  Continuous
>         `-',-.`-'   Kernel
>           ( I )     Integration
>            `-'
> ______________________________________________________________________________
>
> Compile testing
> ---------------
>
> We compiled the kernel for 4 architectures:
>
>     aarch64:
>       make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
>     ppc64le:
>       make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
>     s390x:
>       make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
>     x86_64:
>       make options: make -j24 INSTALL_MOD_STRIP=1 targz-pkg
>
>
>
> Hardware testing
> ----------------
> We booted each kernel and ran the following tests:
>
>   aarch64:
>     Host 1:
>        ✅ Boot test
>        ✅ xfstests - ext4
>        ✅ xfstests - xfs
>        ✅ storage: software RAID testing
>        ✅ Storage: swraid mdadm raid_module test
>        ✅ xfstests - btrfs
>        ✅ Storage blktests
>        ✅ Storage block - filesystem fio test
>        ✅ Storage block - queue scheduler test
>        ✅ Storage nvme - tcp
>        ✅ Storage: lvm device-mapper test
>           stress: stress-ng
>
>     Host 2:
>        ✅ Boot test
>        ✅ ACPI table test
>        ❌ LTP
>        ✅ CIFS Connectathon
>        ✅ POSIX pjd-fstest suites
>        ✅ Loopdev Sanity
>        ✅ Memory: fork_mem
>        ✅ Memory function: memfd_create
>        ✅ AMTU (Abstract Machine Test Utility)
>        ✅ Ethernet drivers sanity
>        ✅ storage: SCSI VPD
>        ✅ xarray-idr-radixtree-test
>
>   ppc64le:
>     Host 1:
>        ✅ Boot test
>        ✅ LTP
>           CIFS Connectathon
>        ⚡⚡⚡ POSIX pjd-fstest suites
>        ⚡⚡⚡ Loopdev Sanity
>        ⚡⚡⚡ Memory: fork_mem
>        ⚡⚡⚡ Memory function: memfd_create
>        ⚡⚡⚡ AMTU (Abstract Machine Test Utility)
>        ⚡⚡⚡ Ethernet drivers sanity
>        ⚡⚡⚡ xarray-idr-radixtree-test
>
>     Host 2:
>        ❌ Boot test
>        ⚡⚡⚡ xfstests - ext4
>        ⚡⚡⚡ xfstests - xfs
>        ⚡⚡⚡ storage: software RAID testing
>        ⚡⚡⚡ Storage: swraid mdadm raid_module test
>        ⚡⚡⚡ xfstests - btrfs
>        ⚡⚡⚡ Storage blktests
>        ⚡⚡⚡ Storage block - filesystem fio test
>        ⚡⚡⚡ Storage block - queue scheduler test
>        ⚡⚡⚡ Storage nvme - tcp
>        ⚡⚡⚡ Storage: lvm device-mapper test
>
>   s390x:
>     Host 1:
>        ✅ Boot test
>        ✅ LTP
>        ✅ CIFS Connectathon
>        ✅ POSIX pjd-fstest suites
>        ✅ Loopdev Sanity
>        ✅ Memory: fork_mem
>        ✅ Memory function: memfd_create
>        ✅ AMTU (Abstract Machine Test Utility)
>        ✅ Ethernet drivers sanity
>        ❌ xarray-idr-radixtree-test
>
>     Host 2:
>        ✅ Boot test
>        ✅ xfstests - ext4
>        ✅ xfstests - xfs
>        ✅ Storage: swraid mdadm raid_module test
>        ✅ xfstests - btrfs
>        ✅ Storage blktests
>        ✅ Storage nvme - tcp
>        ✅ stress: stress-ng
>
>   x86_64:
>     Host 1:
>        ✅ Boot test
>        ✅ Storage SAN device stress - lpfc driver
>
>     Host 2:
>        ✅ Boot test
>        ✅ Storage SAN device stress - mpt3sas_gen1
>
>     Host 3:
>        ✅ Boot test
>        ✅ xfstests - ext4
>        ✅ xfstests - xfs
>        ✅ xfstests - nfsv4.2
>        ✅ storage: software RAID testing
>        ✅ Storage: swraid mdadm raid_module test
>        ✅ xfstests - btrfs
>        ❌ xfstests - cifsv3.11
>        ✅ Storage blktests
>        ✅ Storage block - filesystem fio test
>        ✅ Storage block - queue scheduler test
>        ✅ Storage nvme - tcp
>        ✅ Storage: lvm device-mapper test
>        ✅ stress: stress-ng
>
>     Host 4:
>        ✅ Boot test
>        ✅ ACPI table test
>        ✅ LTP
>        ✅ CIFS Connectathon
>        ✅ POSIX pjd-fstest suites
>        ✅ Loopdev Sanity
>        ✅ Memory: fork_mem
>        ✅ Memory function: memfd_create
>        ✅ AMTU (Abstract Machine Test Utility)
>        ✅ Ethernet drivers sanity
>        ✅ storage: SCSI VPD
>        ✅ xarray-idr-radixtree-test
>
>     Host 5:
>        ✅ Boot test
>        ✅ Storage SAN device stress - qedf driver
>
>     Host 6:
>        ✅ Boot test
>        ✅ Storage SAN device stress - qla2xxx driver
>
>   Test sources: https://gitlab.com/cki-project/kernel-tests
>     Pull requests are welcome for new tests or improvements to existing tests!
>
> Aborted tests
> -------------
> Tests that didn't complete running successfully are marked with ⚡⚡⚡.
> If this was caused by an infrastructure issue, we try to mark that
> explicitly in the report.
>
> Waived tests
> ------------
> If the test run included waived tests, they are marked with . Such tests are
> executed but their results are not taken into account. Tests are waived when
> their results are not reliable enough, e.g. when they're just introduced or are
> being fixed.
>
> Testing timeout
> ---------------
> We aim to provide a report within reasonable timeframe. Tests that haven't
> finished running yet are marked with ⏱.
>
>