On Fri, Aug 16, 2019 at 7:15 PM Michael Ellerman <mpe@xxxxxxxxxxxxxx> wrote: > > Major Hayden <major@xxxxxxxxxx> writes: > > Hello there, > > > > The CKI Project just found a kernel panic while running the blktests > > test suite on stable 5.2.9-rc1[0]. Michael Ellerman requested for this > > list to be copied on these ppc64le failures. > > > > We have some logs[1] for these failures and they start with > > "ppc64le_host_2_Storage_blktests*". We hope this helps! > > > > [0] https://lore.kernel.org/stable/255f9af4-6087-7f56-5860-5aa0397a7631@xxxxxxxxxx/T/#t > > [1] https://artifacts.cki-project.org/pipelines/100875/logs/ > > Thanks for the report. > > It looks like you tested the stable queue yesterday, which AFAICS > results in the exact same source tree as you tested above, and yet > yesterday you didn't see the failure. So it's intermittent, which is > annoying. > > Looking at the oops: > > [ 7101.930385] NIP: c00000000067b230 LR: c00000000067b1d4 CTR: c000000000029140 > [ 7101.930400] REGS: c00020391ccc35c0 TRAP: 0300 Not tainted (5.2.9-rc1-2440e48.cki) > [ 7101.930413] MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44002228 XER: 20040000 > [ 7101.930433] CFAR: c0000000001d9e28 DAR: 800a00066b9e7e28 DSISR: 40000000 IRQMASK: 0 > GPR00: c00000000067b1d4 c00020391ccc3850 c0000000016cdb00 0000067578a4ab31 > GPR04: 0000000000000000 ffffffffffffffff 0000000001f3fb0e 0000000000000001 > GPR08: 800a00066b9e7d80 800a00066b9e7d80 0000000000000001 c00800000cd5db88 > r9 > GPR12: c000000000029140 c000203fff6b9800 0000000000000008 c00800000d5751e0 > GPR16: c000203985a8a6c0 c00800000e013278 c000203985a8a700 0000000000000038 > GPR20: 0000000000000030 0000000000000028 0000000000000020 fffffffffffff000 > GPR24: 0000000000000001 0000000000000400 0000000000000000 0000000000000023 > GPR28: 0000000000000000 0000000000000000 c00020391ccc38c8 c000003ef1b00000 > [ 7101.930544] NIP [c00000000067b230] blk_mq_get_request+0x260/0x4b0 > [ 7101.930557] LR [c00000000067b1d4] blk_mq_get_request+0x204/0x4b0 > [ 7101.930569] Call Trace: > [ 7101.930577] [c00020391ccc3850] [c00000000067b1d4] blk_mq_get_request+0x204/0x4b0 (unreliable) > [ 7101.930594] [c00020391ccc38a0] [c00000000067b688] blk_mq_alloc_request_hctx+0x108/0x1b0 > [ 7101.930617] [c00020391ccc3910] [c00800000cd51aac] nvme_alloc_request+0x54/0xe0 [nvme_core] > [ 7101.930633] [c00020391ccc3940] [c00800000cd5641c] __nvme_submit_sync_cmd+0x64/0x290 [nvme_core] > [ 7101.930651] [c00020391ccc39c0] [c00800000d571650] nvmf_connect_io_queue+0x148/0x1e0 [nvme_fabrics] > [ 7101.930668] [c00020391ccc3ab0] [c00800000e0106b0] nvme_loop_connect_io_queues+0x98/0xf8 [nvme_loop] > [ 7101.930684] [c00020391ccc3af0] [c00800000e01116c] nvme_loop_create_ctrl+0x434/0x6a0 [nvme_loop] > [ 7101.930700] [c00020391ccc3bd0] [c00800000d5724f0] nvmf_dev_write+0xd38/0x124c [nvme_fabrics] > [ 7101.930719] [c00020391ccc3d60] [c000000000421e58] __vfs_write+0x38/0x70 > [ 7101.930731] [c00020391ccc3d80] [c000000000426188] vfs_write+0xd8/0x250 > [ 7101.930744] [c00020391ccc3dd0] [c000000000426558] ksys_write+0x78/0x130 > [ 7101.930758] [c00020391ccc3e20] [c00000000000bde4] system_call+0x5c/0x70 > > And then the disassembly: > > x = op_is_sync(op) > r27 = op = 0x0000000000000023 > c00000000067b1e0: 3e 06 67 57 clrlwi r7,r27,24 # r7 = r27 & REQ_OP_MASK > > c00000000067b1e4: 00 00 07 2c cmpwi r7,0 # if r7 == REQ_OP_READ, x = 1 then goto label2 > x = 1 > c00000000067b1e8: 01 00 20 39 li r9,1 > > ... > > r30 = data = c00020391ccc38c8 > r8 = data->ctx = 800a00066b9e7d80 > c00000000067b208: 18 00 1e e9 ld r8,24(r30) > c00000000067b20c: 18 00 82 41 beq c00000000067b224 <blk_mq_get_request+0x254> -> label2 > > (op & (REQ_SYNC | REQ_FUA | REQ_PREFLUSH)) > c00000000067b210: 1c 05 6a 57 rlwinm r10,r27,0,20,14 > c00000000067b214: 68 03 4a 55 rlwinm r10,r10,0,13,20 > c00000000067b218: 34 00 4a 7d cntlzw r10,r10 > c00000000067b21c: 7e d9 4a 55 rlwinm r10,r10,27,5,31 > c00000000067b220: 01 00 49 69 xori r9,r10,1 > > c00000000067b224: 24 1f 29 79 rldicr r9,r9,3,60 <- label2 > r9 = x * 8 > > c00000000067b228: 01 00 e0 38 li r7,1 # for refcount_set > > r9 = data->ctx + x > c00000000067b22c: 14 4a 28 7d add r9,r8,r9 > > data->ctx->rq_dispatched[op_is_sync(op)]++; > > r10 = data->ctx->rq_dispatched[x] > c00000000067b230: a8 00 49 e9 ld r10,168(r9) <- NIP > > x++ > c00000000067b234: 01 00 4a 39 addi r10,r10,1 > > data->ctx->rq_dispatched[x] = r10 > c00000000067b238: a8 00 49 f9 std r10,168(r9) > > refcount_set(&rq->ref, 1); > c00000000067b23c: d4 00 ff 90 stw r7,212(r31) > > > So we're oopsing at data->ctx->rq_dispatched[op_is_sync(op)]++. > > data->ctx looks completely bogus, ie. 800a00066b9e7d80, that's not > anything like a valid kernel address. > > And also op doesn't look like a valid op value, it's 0x23, which has no > flag bits set, but also doesn't match any of the values in req_opf. > > So I suspect data is pointing somewhere bogus. Or possibly it used to > point at a blk_mq_alloc_data but doesn't anymore. > > Why that's happened I have no idea. I can't see any obvious commits in > mainline or stable that mention anything similar, maybe someone on > linux-block recognises it? > > cheers Please try: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-5.4/block&id=556f36e90dbe7dded81f4fac084d2bc8a2458330 Strictly speaking, it is still a workaround, but it works in case that CPU hotplug isn't involved. Thanks, Ming Lei