Re: io_uring failure on parisc with VIPT caches

On 2023-02-16 3:24 a.m., Helge Deller wrote:
On 2/16/23 03:50, Jens Axboe wrote:
On 2/15/23 7:40 PM, John David Anglin wrote:
On 2023-02-15 6:02 p.m., Jens Axboe wrote:
This is not related to Helge's patch, 6.1-stable is just still missing:

commit fcc926bb857949dbfa51a7d95f3f5ebc657f198c
Author: Jens Axboe<axboe@xxxxxxxxx>
Date:   Fri Jan 27 09:28:13 2023 -0700

      io_uring: add a conditional reschedule to the IOPOLL cancelation loop

and I'm guessing you're running without preempt.
With 6.2.0-rc8+, I had a different crash running poll-race-mshot.t:

Backtrace:


Kernel Fault: Code=15 (Data TLB miss fault) at addr 0000000000000000
CPU: 0 PID: 18265 Comm: poll-race-mshot Not tainted 6.2.0-rc8+ #1
Hardware name: 9000/800/rp3440

      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00010000001001001001000111110000 Not tainted
r00-03  00000000102491f0 ffffffffffffffff 000000004020307c ffffffffffffffff
r04-07  ffffffffffffffff ffffffffffffffff ffffffffffffffff ffffffffffffffff
r08-11  ffffffffffffffff 000000000407ef28 000000000407f838 8400000000800000
r12-15  0000000000000000 0000000040c424e0 0000000040c424e0 0000000040c424e0
r16-19  000000000407fd68 0000000063f08648 0000000040c424e0 000000000a085000
r20-23  00000000000d6b44 000000002faf0800 00000000000000ff 0000000000000002
r24-27  000000000407fa30 000000000407fd68 0000000000000000 0000000040c1e4e0
r28-31  400000000000de84 0000000000000000 0000000000000000 0000000000000002
sr00-03  0000000004081000 0000000000000000 0000000000000000 0000000004081de0
sr04-07  0000000004081000 0000000000000000 0000000000000000 00000000040815a8

IASQ: 0000000004081000 0000000000000000 IAOQ: 0000000000000000 0000000004081590
  IIR: 00000000    ISR: 0000000000000000  IOR: 0000000000000000
  CPU:        0   CR30: 000000004daf5700 CR31: ffffffffffffefff
  ORIG_R28: 0000000000000000
  IAOQ[0]: 0x0
  IAOQ[1]: linear_quiesce+0x0/0x18 [linear]
  RP(r2): intr_check_sig+0x0/0x3c
Backtrace:

Kernel panic - not syncing: Kernel Fault

This means very little to me, is it a NULL pointer deref? And where's
the backtrace?

I see iopoll.t triggering the kernel to hang on a 32-bit kernel.
The system gets unresponsive, but with sysrq-l I get:

[  880.020641] sysrq: Show backtrace of all active CPUs
[  880.024123] sysrq: CPU0:
[  880.024123] CPU: 0 PID: 7549 Comm: kworker/u32:7 Not tainted 6.1.12-32bit+ #1595
[  880.024123] Hardware name: 9000/785/C3700
[  880.024123] Workqueue: events_unbound io_ring_exit_work
[  880.024123]
[  880.024123]      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
[  880.024123] PSW: 00000000000011001111111100001111 Not tainted
[  880.024123] r00-03  000cff0f 19610540 104f7b70 19610540
[  880.024123] r04-07  1921a278 00000000 192c8400 1921b508
[  880.024123] r08-11  00000003 0000002e 195fd050 00000004
[  880.024123] r12-15  192c8710 10a77000 00000000 00002000
[  880.024123] r16-19  1921a210 1240c000 1240c060 1924aff0
[  880.024123] r20-23  00000002 00000000 104b4384 00000020
[  880.024123] r24-27  00000003 19610548 1921a210 10aba968
[  880.024123] r28-31  1094f5c0 0000000e 196105c0 104f7b70
[  880.024123] sr00-03  00000000 00001695 00000000 00001695
[  880.024123] sr04-07  00000000 00000000 00000000 00000000
[  880.024123]
[  880.024123] IASQ: 00000000 00000000 IAOQ: 104f7b6c 104b4384
[  880.024123]  IIR: 081f0242    ISR: 00002000  IOR: 00000000
[  880.024123]  CPU:        0   CR30: 195fd050 CR31: d237ffff
[  880.024123]  ORIG_R28: 00000000
[  880.024123]  IAOQ[0]: io_do_iopoll+0xb4/0x3a4
[  880.024123]  IAOQ[1]: iocb_bio_iopoll+0x0/0x50
[  880.024123]  RP(r2): io_do_iopoll+0xb8/0x3a4
[  880.024123] Backtrace:
[  880.024123]  [<1092a2b0>] io_uring_try_cancel_requests+0x184/0x3b0
[  880.024123]  [<1092a57c>] io_ring_exit_work+0xa0/0x4c4
[  880.024123]  [<101cb448>] process_one_work+0x1c4/0x3cc
[  880.024123]  [<101cb7d8>] worker_thread+0x188/0x4b4
[  880.024123]  [<101d5910>] kthread+0xec/0xf4
[  880.024123]  [<1018801c>] ret_from_kernel_thread+0x1c/0x24
I had updated to 6.2.0-rc8+ to avoid this issue.

I agree there's not a lot of helpful info in the dump.  Somehow, the code has branched to
location 0 and attempted to execute the instruction there.  RP points at intr_check_sig+0x0,
which is not a valid return point for a call instruction.  In the dump above, SP (r30) is 0.
Maybe the stack overflowed for the process?
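
As an aid for reading the PSW lines in the two dumps above, here is a small Python sketch
that lines the bit string up with the legend printed just above it. It only does the
position-by-position matching that the dump layout implies; it does not interpret what each
flag means. The function names are made up for illustration. In both dumps the low 32 bits
of r00 match the PSW line, so the sketch also rebuilds the bit string from that word.

```python
# Legend exactly as printed above the PSW line in the dumps.
PSW_LEGEND = "YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI"


def psw_bits_from_word(hexword):
    """Return the low 32 bits of a hex register value as a 32-character bit string."""
    return format(int(hexword, 16) & 0xFFFFFFFF, "032b")


def decode_psw(bits):
    """Pair every set bit with the legend letter printed above it.

    Returns a list of (position, letter) tuples, position 0 being the
    leftmost character of the PSW line.
    """
    if len(bits) != len(PSW_LEGEND) or set(bits) - {"0", "1"}:
        raise ValueError("expected a 32-character string of 0s and 1s")
    return [(i, name)
            for i, (bit, name) in enumerate(zip(bits, PSW_LEGEND))
            if bit == "1"]


if __name__ == "__main__":
    # PSW lines copied from the 64-bit (6.2.0-rc8+) and 32-bit (6.1.12) dumps.
    for label, psw in (("6.2.0-rc8+", "00010000001001001001000111110000"),
                       ("6.1.12-32bit+", "00000000000011001111111100001111")):
        letters = "".join(name for _, name in decode_psw(psw))
        print(label, letters)
```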

I have run the test multiple times by itself.  It consistently generates an HPMC.  The PIM
dump provides no more info than the above dump (i.e., the kernel has tried to execute location 0).
It didn't appear that SP had been clobbered in the PIM dump that I looked at.

Running the test under strace gives different points where the trace stops:

io_uring_setup(64, {flags=0, sq_thread_cpu=0, sq_thread_idle=0, sq_entries=64, cq_entries=128, features=IORING_FEAT_SINGLE_MMAP|IORING_FEAT_NODROP|IORING_FEAT_SUBMIT_STABLE|IORING_FEAT_RW_CUR_POS|IORING_FEAT_CUR_PERSONALITY|IORING_FEAT_FAST_POLL|IORING_FEAT_POLL_32BITS|0x1f80, sq_off={head=0, tail=16, ring_mask=64, ring_entries=72, flags=84, dropped=80, array=2144}, cq_off={head=32, tail=48, ring_mask=68, ring_entries=76, overflow=92, cqes=96, flags=0x58 /* IORING_CQ_??? */}}) = 3

io_uring_enter(3, 64, 0, 0, NULL, 8)    = 64

io_uring_setup(64, {flags=0, sq_thread_cpu=0, sq_thread_idle=0, sq_entries=64, cq_entries=128, features=IORING_FEAT_SINGLE_MMAP|IORING_FEAT_NODROP|IORING_FEAT_SUBMIT_STABLE|IORING_FEAT_RW_CUR_POS|IORING_FEAT_CUR_PERSONALITY|IORING_FEAT_FAST_POLL|IORING_FEAT_POLL_32BITS|0x1f80, sq_off={head=0, tail=16, ring_mask=64, ring_entries=72, flags=84, dropped=80, array=2144}, cq_off={head=32, tail=48, ring_mask=68, ring_entries=76, overflow=92, cqes=96, flags=0x58 /* IORING_CQ_??? */}}) = 3
mmap2(NULL, 2400, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE, 3, 0) = 0xf8cad000
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_POPULATE, 3, 0x10000000
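
For reference, the two mmap2 calls in the second trace can be checked against the values
returned by io_uring_setup. The sketch below is only that arithmetic in Python, not code
from liburing or the test. It assumes the default 64-byte SQE and 16-byte CQE (the trace
shows flags=0) and a single shared SQ/CQ ring mapping (IORING_FEAT_SINGLE_MMAP is in the
feature list). The function name is hypothetical; the offset constants are the ones from
the io_uring UAPI header.

```python
# mmap offsets from include/uapi/linux/io_uring.h
IORING_OFF_SQ_RING = 0x0
IORING_OFF_CQ_RING = 0x8000000
IORING_OFF_SQES = 0x10000000

SIZEOF_U32 = 4    # one slot of the SQ index array
SIZEOF_SQE = 64   # struct io_uring_sqe, assuming no IORING_SETUP_SQE128
SIZEOF_CQE = 16   # struct io_uring_cqe, assuming no IORING_SETUP_CQE32


def ring_mmap_sizes(sq_entries, cq_entries, sq_array_off, cq_cqes_off,
                    single_mmap=True):
    """Compute the mmap lengths for the SQ ring, CQ ring and SQE array."""
    sq_ring = sq_array_off + sq_entries * SIZEOF_U32
    cq_ring = cq_cqes_off + cq_entries * SIZEOF_CQE
    if single_mmap:
        # With a single mapping, both rings share the larger of the two sizes.
        sq_ring = cq_ring = max(sq_ring, cq_ring)
    sqes = sq_entries * SIZEOF_SQE
    return {"sq_ring": sq_ring, "cq_ring": cq_ring, "sqes": sqes}


if __name__ == "__main__":
    # Values taken from the io_uring_setup line in the strace output:
    # sq_entries=64, cq_entries=128, sq_off.array=2144, cq_off.cqes=96
    sizes = ring_mmap_sizes(64, 128, 2144, 96)
    print("ring mapping: %d bytes at offset %#x" % (sizes["sq_ring"], IORING_OFF_SQ_RING))
    print("SQE mapping:  %d bytes at offset %#x" % (sizes["sqes"], IORING_OFF_SQES))
```

With the numbers from the trace this gives 2400 bytes at offset 0 and 4096 bytes at offset
0x10000000, matching the two mmap2 calls. So in that run the ring mapping completed and
the trace stops at the SQE array mapping.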

--
John David Anglin  dave.anglin@xxxxxxxx



