On 3/9/20 12:42 PM, Paolo Valente wrote:
Hi Rachel,
IIUC, you can reproduce this bug reliably. If so, I'd need you to test a debugging patch (on top of one of the offending kernels).
Hi Paolo,
Yes, it seems we have seen it pretty consistently in the last three reports, but I'm cloning the job to be sure we can
reproduce it reliably. In the meantime, feel free to send me a pointer to your debugging patch so I can retry with
the patch applied.
Thank you,
Rachel
Looking forward to your feedback,
Paolo
On 9 Mar 2020, at 15:27, Rachel Sibley <rasibley@xxxxxxxxxx> wrote:
(cc'ing linux-block@xxxxxxxxxxxxxxx)
Hello,
We are seeing a kernel panic triggered by LTP and xfstests against a recent mainline commit, and
wanted to share it in case it's not already known.
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Commit: 61a09258f2e5 - Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
We have also seen it with commits 2c523b344dfa and 378fee2e6b12.
LTP: https://cki-artifacts.s3.us-east-2.amazonaws.com/datawarehouse/2020/03/08/477469/x86_64_1_console.log
xfstests: https://cki-artifacts.s3.us-east-2.amazonaws.com/datawarehouse/2020/03/08/477469/x86_64_4_console.log
[-- MARK -- Sun Mar 8 02:45:00 2020]
[ 762.315610] BUG: kernel NULL pointer dereference, address: 0000000000000158
[ 762.323385] #PF: supervisor read access in kernel mode
[ 762.329119] #PF: error_code(0x0000) - not-present page
[ 762.334853] PGD 0 P4D 0
[ 762.337680] Oops: 0000 [#1] SMP PTI
[ 762.341575] CPU: 9 PID: 87 Comm: kworker/9:1 Not tainted 5.6.0-rc4-61a0925.cki #1
[ 762.349927] Hardware name: Cisco Systems, Inc. UCS-E160DP-M1/K9/UCS-E160DP-M1/K9, BIOS UCSED.1.5.0.2.051520131757 05/15/2013
[ 762.362453] Workqueue: cgroup_destroy css_killed_work_fn
[ 762.368387] RIP: 0010:bfq_bfqq_expire+0x1c/0x940
[ 762.373540] Code: 01 00 00 c7 80 f8 00 00 00 01 00 00 00 c3 66 66 66 66 90 41 57 41 56 41 55 41 54 41 89 cc 55 48 89 fd 53 48 89 f3 48 83 ec 28 <8b> be 58 01 00 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 20 31 c0
[ 762.394500] RSP: 0018:ffff9927c03bbd50 EFLAGS: 00010086
[ 762.400331] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000004
[ 762.408301] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8965a3913800
[ 762.416270] RBP: ffff8965a3913800 R08: ffff896592d41098 R09: ffff89657aa8df00
[ 762.424233] R10: 0000000000000000 R11: ffff89657aa8df00 R12: 0000000000000004
[ 762.432200] R13: ffff89659f0cd9b0 R14: ffff8965a3913bf0 R15: ffff89659f0cd898
[ 762.440175] FS: 0000000000000000(0000) GS:ffff8965a7c40000(0000) knlGS:0000000000000000
[ 762.449211] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 762.455622] CR2: 0000000000000158 CR3: 000000065afc6003 CR4: 00000000000606e0
[ 762.463599] Call Trace:
[ 762.466341] ? bfq_idle_extract+0x40/0xb0
[ 762.470821] bfq_bfqq_move+0x14f/0x160
[ 762.475011] bfq_pd_offline+0xd3/0xf0
[ 762.479112] blkg_destroy+0x52/0xf0
[ 762.483005] blkcg_destroy_blkgs+0x4f/0xa0
[ 762.487582] css_killed_work_fn+0x4d/0xd0
[ 762.492066] process_one_work+0x1b5/0x360
[ 762.496547] worker_thread+0x50/0x3c0
[ 762.500641] kthread+0xf9/0x130
[ 762.504153] ? process_one_work+0x360/0x360
[ 762.508813] ? kthread_park+0x90/0x90
[ 762.512909] ret_from_fork+0x35/0x40
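The Code: line in the oops above can be cross-checked against the register dump. The following is a minimal sketch, not a general disassembler: it assumes the marked instruction is a plain `8b` load with a 32-bit displacement, no REX prefix, and no SIB byte, which is what the bytes here look like. The function name `decode_faulting_mov` and the register tables are illustrative only; the kernel tree's scripts/decodecode is the proper tool for this job.

```python
# Bytes copied from the "Code:" line of the oops; <..> marks the faulting byte.
CODE = (
    "01 00 00 c7 80 f8 00 00 00 01 00 00 00 c3 66 66 66 66 90 41 57 41 56 "
    "41 55 41 54 41 89 cc 55 48 89 fd 53 48 89 f3 48 83 ec 28 <8b> be 58 01 "
    "00 00 65 48 8b 04 25 28 00 00 00 48 89 44 24 20 31 c0"
)

# x86-64 register numbering for the low eight registers (no REX extension).
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_faulting_mov(code):
    """Decode 'mov disp32(%base), %dst' starting at the <..> marked byte.

    Returns (destination register, base register, displacement).
    Only handles opcode 0x8b with mod=2 and no SIB byte.
    """
    tokens = code.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    raw = bytes(int(t.strip("<>"), 16) for t in tokens[start:])

    opcode, modrm = raw[0], raw[1]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 2 or rm == 4:
        raise ValueError("instruction form not covered by this sketch")

    # disp32 is stored little-endian right after the ModRM byte.
    disp = int.from_bytes(raw[2:6], "little", signed=True)
    return REG32[reg], REG64[rm], disp


dst, base, disp = decode_faulting_mov(CODE)
rsi = 0x0  # RSI value from the register dump
print(f"mov {disp:#x}(%{base}),%{dst} -> load address {rsi + disp:#018x}")
```

With RSI = 0 in the dump, the load address is 0 + 0x158, which matches both CR2 and the reported fault address. Assuming RSI still holds the second argument of bfq_bfqq_expire at this point (the System V x86-64 calling convention), this suggests that argument was NULL, but that reading is an inference from the dump rather than something stated in the report.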
Thanks,
Rachel
On 3/7/20 9:59 PM, CKI Project wrote:
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
Commit: 61a09258f2e5 - Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
The results of these automated tests are provided below.
Overall result: FAILED (see details below)
Merge: OK
Compile: OK
Tests: FAILED
All kernel binaries, config files, and logs are available for download here:
https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=datawarehouse/2020/03/08/477469
One or more kernel tests failed:
x86_64:
❌ LTP
❌ xfstests - ext4
We hope that these logs can help you find the problem quickly. For the full
detail on our testing procedures, please scroll to the bottom of this message.
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
 ,-.   ,-.
( C ) ( K )  Continuous
 `-',-.`-'   Kernel
   ( I )     Integration
    `-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 1 architecture:
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
x86_64:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
❌ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Ethernet drivers sanity
⚡⚡⚡ Networking MACsec: sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking sctp-auth: sockopts test
⚡⚡⚡ Networking: igmp conformance test
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - transport
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ audit: audit testsuite test
⚡⚡⚡ httpd: mod_ssl smoke sanity
⚡⚡⚡ tuned: tune-processes-through-perf
⚡⚡⚡ pciutils: sanity smoke test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
⚡⚡⚡ storage: SCSI VPD
⚡⚡⚡ trace: ftrace/tracer
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ jvm - DaCapo Benchmark Suite
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ LTP: openposix test suite
🚧 ⚡⚡⚡ Networking vnic: ipvlan/basic
🚧 ⚡⚡⚡ iotop: sanity
🚧 ⚡⚡⚡ Usex - version 1.9-29
🚧 ⚡⚡⚡ storage: dm/common
Host 2:
✅ Boot test
✅ Storage SAN device stress - mpt3sas driver
Host 3:
✅ Boot test
✅ Storage SAN device stress - megaraid_sas
Host 4:
✅ Boot test
❌ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ lvm thinp sanity
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ stress: stress-ng
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
Test sources: https://github.com/CKI-project/tests-beaker
💚 Pull requests are welcome for new tests or improvements to existing tests!
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within a reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.